Unit-Cube Expression for Space-Charge Resistance


By S. M. SZE and W. SHOCKLEY*
(Manuscript received November 17, 1966)


A simple analysis shows that the unit-cube conductance is a figure of
merit in semiconductor device design theory. The unit-cube conductance,
G, is given by $2Kv_d$, where K is the permittivity of the semiconductor
and $v_d$ is the limiting drift velocity.

The space-charge resistance, $R_{sc}$, due to carriers generated under
avalanche conditions is derived for p-n junctions. It is found that for a
parallel-plane structure, $R_{sc} = 1/GN$, where N is the number of unit
cubes in the depletion region with cube edge equal to the depletion
width, or $N = A/W^2$, where W is the depletion width and A the junction
area. The disturbance in voltage caused by the space-charge effect is
given by $I/GN = JW^2/G$, where I and J are the current and current
density, respectively. Similar results are obtained for p-n junctions
with coaxial-cylinder and concentric-sphere structures.

For silicon, the value of G is approximately 40 µmhos. The
transconductance of a silicon surface-controlled avalanche transistor in
terms of the unit-cube expression is about 12.5 N µmhos.


A simple analysis of "avalanche resistance" can be given for the
limiting case in which carriers are generated at one boundary surface
of the depletion region of a p-n junction and travel across the depletion
region with a limiting drift velocity $v_d$. Structures satisfying these
conditions can be of the $n^+pp^+$ form. It will be shown that the
quantity $2Kv_d$ (where K is the permittivity of the semiconductor) is a
figure of merit in semiconductor device design theory which limits the
performance of space-charge-limited devices. This quantity is a
combination of "material constants" similar to $F_Bv_d$ (where $F_B$ is
the "breakdown field"), which limits the frequency-power performance of
transistors, and $K/\sigma$ ($\sigma$ is the conductivity), which limits
the gain-bandwidth product of solid-state devices.


*Stanford University and Bell Telephone Laboratories.

For a structure in which the space-charge layer is bounded by parallel
planes of area A and spacing W, it will be shown that the effective
space-charge resistance can be interpreted as due to N unit-cube
conductances in parallel, where the unit-cube conductance G is given by

$$G = 2Kv_d \tag{1}$$

and N is the number of unit cubes in the depletion layer with cube edge
equal to the depletion width, or

$$N = A/W^2. \tag{2}$$

The space-charge resistance is then given by

$$R_{sc} = 1/NG. \tag{3}$$


For coaxial-cylinder and concentric-sphere structures, similar results
are obtained for $R_{sc}$. The number of unit cubes (or curvilinear
cubes), however, depends on the radius of the surface upon which
avalanche occurs and the length of the cylinder (for the coaxial-cylinder
structure). These functional dependences are derived below.

An interesting application of the space-charge resistance and the
unit-cube expression is given for a surface-controlled avalanche
transistor (SCAT).


I. PARALLEL PLANE STRUCTURE 


As represented in Fig. 1(a), the depletion layer of an $n^+pp^+$
structure extends through the p layer with a doping of $N_A$, and is
bounded by the planes at x = 0 and x = W. When the applied voltage V is
equal to the breakdown voltage $V_B$, the electric field E(x) has its
maximum absolute value $F_B$ at x = 0 and decreases to
$F_B - (qN_AW/K)$ at x = W. This insures breakdown at x = 0.
Furthermore, if $qN_AW/K < 0.9F_B$, then the field is everywhere
$\ge F_B/10$, so that holes have their limiting drift velocity $v_d$ all
across W.

The space-charge current, I, is given by

$$I = v_d\rho A, \tag{4}$$
Fig. 1 — p-n junction geometry of (a) parallel plane, (b) coaxial cylinder,
and (c) concentric sphere structures.


where $\rho$ is the carrier-charge density and A the area. Since E at
x = 0 is assumed to be equal to $F_B$, the disturbance $\Delta E(x)$ in
the electric field due to $\rho$ is

$$\Delta E(x) = \frac{Ix}{AKv_d}, \tag{5}$$

so that the disturbance in voltage caused by the carriers (i.e., the
average field times W) is obtained by integrating $\Delta E(x)$:

$$\Delta V_B = \frac{IW^2}{2AKv_d} = \frac{I}{GN}. \tag{6}$$


The total voltage is thus

$$V = V_B + \Delta V_B = \left(F_BW - \frac{qN_AW^2}{2K}\right) + IR_{sc}, \tag{7}$$

which verifies the interpretation of G and N.


II. COAXIAL CYLINDER STRUCTURE


Consider first that the maximum field occurs at the inner surface.
As shown in Fig. 1(b), the depletion layer extends through the intrinsic
region of an $n^+ip^+$ coaxial-cylinder structure and is bounded by the
cylinders of radii r = a and r = b. When $V = V_B$, the electric field
E(r) has its maximum absolute value $F_B$ at r = a, and decreases to
$F_Ba/b$ at r = b.

The space-charge current per unit length, I/L, is given by

$$\frac{I}{L} = 2\pi r\rho v_d \tag{8}$$
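so that $\rho$ varies as 1/r. Integrating Poisson's equation leads to a
disturbance $\Delta E(r)$ in the electric field and $\Delta V_B$ in the
voltage due to $\rho$ given by

$$\Delta E(r) = \frac{\rho(r - a)}{K} = \frac{I}{2\pi Kv_dL}\left(1 - \frac{a}{r}\right) \tag{9}$$

and

$$\Delta V_B = \frac{I}{(2Kv_d)\pi L}\left[b - a - a\ln\left(\frac{b}{a}\right)\right] = \frac{I}{GN} = IR_{sc}, \tag{10}$$

where

$$\frac{1}{N} = \frac{b}{\pi L}\left(1 - \frac{a}{b} - \frac{a}{b}\ln\frac{b}{a}\right) = \frac{A_c}{2\pi bL}. \tag{11}$$

$A_c$ is the area on the outer cylinder surface that corresponds to one
unit-cube conductance G:

$$A_c \to (b - a)^2 \ \text{for}\ a \to b, \qquad A_c \to 2b^2 \ \text{for}\ a \to 0. \tag{12}$$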


Equation (10) may be interpreted as the resistance of N unit curvilinear
cubes in parallel. These cubes are formed by intersection of equipotential
surfaces with the orthogonal family of electric field lines. Each cube has
a conductance $(2Kv_d)$, and the number of cubes N is given by (11).
The area $A_c$ approaches $(b - a)^2$ when $a \to b$, and approaches
$2(b - a)^2$ when $a \to 0$, and consequently remains finite even as the
inner cylinder approaches a line.

The maximum field may be caused to occur on the outer surface
r = b by adjusting the chemical charges in the depletion layer
appropriately, such as a $p^+pn^+$ structure with the $pn^+$ junction at
r = b, the $p^+p$ boundary at r = a. In this case, the area $A_c$
approaches $(b - a)^2$ when $a \to b$, and approaches $2b^2\ln(b/a)$ when
$a \to 0$. Hence, the space-charge resistance has the same value as given
by (10) and (11) when $a \to b$, but approaches infinity as $a \to 0$.
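The limiting values of the unit-cube area quoted above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the inner-surface expression is taken from (11), while the outer-surface expression is an assumed closed form obtained by repeating the same Poisson integration with the field disturbance pinned to zero at r = b. The function names are illustrative.

```python
import math

def area_inner_max(a: float, b: float) -> float:
    """A_c from (11): maximum field on the inner cylinder r = a."""
    x = a / b
    # 1 - a/b - (a/b) ln(b/a), written with ln(x) = -ln(b/a)
    return 2.0 * b**2 * (1.0 - x + x * math.log(x))

def area_outer_max(a: float, b: float) -> float:
    """Assumed A_c when the maximum field is on the outer cylinder r = b."""
    x = a / b
    # ln(b/a) - 1 + a/b
    return 2.0 * b**2 * (x - 1.0 - math.log(x))

b = 1.0
# Thin shell (a -> b): A_c / (b - a)^2 should approach 1.
thin = area_inner_max(0.99, b) / (b - 0.99) ** 2
# Line source (a -> 0): A_c / (2 b^2) should approach 1.
line = area_inner_max(1e-9, b) / (2.0 * b**2)
# Outer-surface case (a -> 0): A_c / (2 b^2 ln(b/a)) approaches 1 slowly.
outer = area_outer_max(1e-9, b) / (2.0 * b**2 * math.log(b / 1e-9))
print(thin, line, outer)
```

The logarithmic growth of the outer-surface area is what makes the space-charge resistance diverge in that case, whereas the inner-surface area stays bounded.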


III. CONCENTRIC SPHERE STRUCTURE


As shown in Fig. 1(c), the depletion layer extends through the
intrinsic region of an $n^+ip^+$ concentric sphere structure and is
bounded by the spheres of radii r = a and r = b. When $V = V_B$, the
electric field E(r) has its maximum value $F_B$ at the inner surface
r = a, and decreases to $F_Ba^2/b^2$ at r = b.

The space-charge current is given by

$$I = 4\pi r^2\rho v_d. \tag{13}$$


The quantities $\Delta E(r)$ and $\Delta V_B$ are given as follows:

$$\Delta E(r) = \frac{\rho(r)(r - a)}{K} = \frac{I}{4\pi Kv_dr}\left(1 - \frac{a}{r}\right), \tag{14}$$

$$\Delta V_B = \frac{I}{(2Kv_d)2\pi}\left(\ln\frac{b}{a} - 1 + \frac{a}{b}\right) = \frac{I}{GN} = IR_{sc}, \tag{15}$$

where

$$\frac{1}{N} = \frac{1}{2\pi}\left(\ln\frac{b}{a} - 1 + \frac{a}{b}\right) = \frac{A_c}{4\pi b^2}, \tag{16}$$

$$A_c \to (b - a)^2 \ \text{for}\ a \to b, \qquad A_c \to \infty \ \text{for}\ a \to 0, \tag{17}$$

and $A_c$ is the quantity of area on the outer sphere surface that
corresponds to one curvilinear unit-cube conductance G. When a is finite
or $a \to b$, (15) may be interpreted as the resistance of N unit
curvilinear cubes in parallel, where each cube has a conductance
$(2Kv_d)$ and the number of cubes N is given by (16). Unlike the
$n^+ip^+$ coaxial-cylinder case, $R_{sc}$ of the concentric-sphere
structure approaches infinite resistance as the inner sphere approaches
a point.

$F_B$ can occur on the outer sphere for a $p^+pn^+$ structure, for
example. The results for $A_c$ are the same for both limiting cases as
given in (17).
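The contrast with the cylindrical case can be illustrated numerically. This is a small sketch assuming the closed form $A_c = 2b^2[\ln(b/a) - 1 + a/b]$ implied by (16); the function names are illustrative and not from the original paper.

```python
import math

def sphere_cube_area(a: float, b: float) -> float:
    """A_c for concentric spheres with the maximum field at r = a."""
    return 2.0 * b**2 * (math.log(b / a) - 1.0 + a / b)

def sphere_cube_count(a: float, b: float) -> float:
    """Number of curvilinear unit cubes, N = 4 pi b^2 / A_c."""
    return 4.0 * math.pi * b**2 / sphere_cube_area(a, b)

b = 1.0
# Thin shell: A_c / (b - a)^2 should be close to 1.
ratio_thin = sphere_cube_area(0.99, b) / (b - 0.99) ** 2
# Shrinking inner sphere: N falls steadily, so R_sc = 1/(G N) grows without bound.
counts = [sphere_cube_count(a, b) for a in (1e-2, 1e-4, 1e-8)]
print(ratio_thin, counts)
```

Because N only falls logarithmically, the divergence of the resistance is slow, but there is no finite limit as the inner sphere shrinks to a point.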

An interesting application of the unit-cube expression is that for a
surface-controlled avalanche transistor (SCAT) with a total junction
perimeter of P and a space-charge layer W. Because the space-charge
resistance is finite for an $n^+ip^+$ coaxial cylinder structure as the
inner cylinder approaches a line [see (12)], it is feasible to make
calculations for SCAT on the basis of an avalanche line source. There are
N = P/W such unit cubes around the edge, and the total transconductance,
$g_m$,


is

$$g_m = \frac{(2Kv_d)N}{\pi} \tag{18}$$
TABLE I — UNIT-CUBE EXPRESSIONS (W = b − a)

Structure                Parallel   Coaxial cylinders        Concentric spheres
                         plane      a → b       a → 0        a → b
N (no. of unit cubes)    A/W^2      2πbL/W^2    πL/b         4πb^2/W^2
Cube edge (cm)           W          W           √2 b         W
R_sc (ohm)               1/GN = 1/[(2Kv_d)N] for all structures
or

$$g_m = \frac{1}{\pi R_{sc}}. \tag{19}$$


For a silicon SCAT with a device geometry of P = 1000 µ and W =
0.5 µ, there are 2000 unit cubes with cube edge 0.5 µ. The space-charge
resistance is 12.5 ohms, and the transconductance is 25,400 µmhos.

Another application is to calculate the voltage disturbance
$\Delta V_B$ in a Read diode. For a silicon Read diode with drift region
of 10 µm and an operating current density of 1000 amp/cm², the value of
$\Delta V_B$, as obtained from (6), is approximately 25 volts.
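The two numerical examples above follow directly from (2), (3) and (6). The sketch below reproduces the arithmetic, taking G = 40 µmhos for silicon as quoted in the text; the function names and unit choices are illustrative assumptions.

```python
G_SILICON = 40e-6  # unit-cube conductance in mhos, value quoted in the text

def space_charge_resistance(n_cubes: float, g: float = G_SILICON) -> float:
    """R_sc = 1 / (G N), in ohms."""
    return 1.0 / (g * n_cubes)

def voltage_disturbance(j: float, width_cm: float, g: float = G_SILICON) -> float:
    """Delta V_B = J W^2 / G, with J in amp/cm^2 and W in cm."""
    return j * width_cm**2 / g

# SCAT: perimeter 1000 microns, depletion width 0.5 micron (only the ratio matters).
n_scat = 1000.0 / 0.5
r_scat = space_charge_resistance(n_scat)

# Read diode: 10 micron drift region (1e-3 cm) at 1000 amp/cm^2.
dv_read = voltage_disturbance(1000.0, 10e-4)
print(n_scat, r_scat, dv_read)
```

Both results scale inversely with G, so any revision of the assumed limiting drift velocity changes them in direct proportion.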

A summary of the number of unit cubes and other pertinent quantities
is presented in Table I. It has been shown that the unit-cube conductance
$(2Kv_d)$ is a figure of merit in semiconductor device design theory. The
unit-cube expressions are shown to be useful for calculation of the
space-charge resistance.
Comparison of M-ary Modulation Systems


By IRA JACOBS 
(Manuscript received September 22, 1966) 


Consideration of large alphabet digital communication systems is of both
theoretical and practical interest. Although performance bounds on optimum
systems for the Gaussian channel are available, constructive methods for
approaching these bounds are unknown, except in a few very special cases.
Specific systems have been proposed and evaluated relative to these bounds,
but exact evaluation of error probability is generally a difficult numerical
task. It is of interest to consider simpler performance criteria which permit
comparison of various systems without extensive computation.

An easily evaluated criterion (based on the alphabet size and minimum
distance between signal vectors) is shown to yield a simple sufficient
condition for one system to be better than another (smaller error
probability for the same energy-per-bit). The criterion is applied to
orthogonal, biorthogonal, simplex, and more general permutation modulation
systems. In addition to comparing the various systems, we consider ways of
obtaining good special cases of permutation modulation. Finally, we assess a
recently proposed system ("N-orthogonal phase modulation") and show
that it is generally inferior to more conventional techniques.


I. INTRODUCTION 


The choice of waveforms for communicating over the Gaussian
additive noise channel is a classic problem in communication theory.
Orthogonal modulation systems (i.e., digital communications in which
the alphabet consists of orthogonal waveforms) are known to result
in good power efficiency at the expense of poor bandwidth utilization.
As the alphabet size M is increased, the energy-per-bit E required to
achieve a given error probability $P_e$ diminishes, but the information
rate to bandwidth ratio (R/W) diminishes even more rapidly.
Biorthogonal and simplex modulation afford somewhat improved
performance, but are likewise restricted to low values of R/W.

There is considerable interest in finding large alphabet systems 
which have both good power efficiency and good bandwidth utilization. 
Slepian has given bounds on what can be achieved, but constructive
techniques for approaching these bounds are generally unknown.

Although computer evaluation is ultimately required for precise 
knowledge of error probability, it is of interest to consider simpler 
performance criteria which permit at least a qualitative comparison 
of various systems without extensive computation. It is the purpose 
of this paper to demonstrate the utility of the latter approach. 

After defining the problem more precisely in Section II, some
well-known bounds on the error probability are employed in Section III
to obtain a simple analytic criterion for comparing systems in the limit
of low $P_e$. In Section IV this criterion is applied to systems (PSK,
FSK, biorthogonal, and simplex) for which extensive exact computations
are available and for which the conclusions drawn are already
well-known. After these illustrative examples, permutation modulation
is considered in Section V and N-orthogonal phase modulation
in Section VI. It is shown that the former can yield better performance
than conventional techniques, but that the latter is generally inferior.
Finally, in Section VII limits on our performance criterion, obtained
from sphere-packing arguments, are presented.


II. COMMUNICATION SYSTEM MODEL 


We consider an M-ary modulation system of equienergy waveforms
$S_i(t)$, $i = 1, \ldots, M$, on (0,T), having the correlation matrix

$$\rho_{ij} = \frac{1}{E_s}\int_0^T S_i(t)S_j(t)\,dt. \tag{1}$$

$E_s$ is the energy of each waveform so that $\rho_{ii} = 1$, and
$-1 \le \rho_{ij} \le 1$. It is conventional to define a normalized
information rate, $(2\log_2 M)/n$, where $n \le M$ is the rank of the
correlation matrix (dimensionality of the signal space). We choose to
call this normalized rate the "information to bandwidth ratio, R/W"
motivated by the relations

$$\frac{R}{W} = \frac{\log_2 M}{TW} = \frac{2\log_2 M}{n}, \tag{2}$$

where the second equality follows if we set n = 2TW, which is at
least partially justified for large n by the work of Pollak and Landau.
For our purposes, the right-hand side of (2) may be considered as the
definition of R/W.

It will be assumed that in addition to $\rho_{ii} = 1$ that each row of
the correlation matrix can be written as a permutation of the first row.
Considering the waveforms as vectors in an n-dimensional linear vector
space, this means that each waveform sees an identical environment
of neighboring waveforms. This restriction is a desirable one if it is
desired to transmit each waveform with equal a priori probability.
The restriction is satisfied by the various modulation systems mentioned
in the introduction.* Slepian has termed such systems "group codes
for the Gaussian channel."

It is assumed that the receiver observes a waveform x(t) on the
interval (0,T)

$$x(t) = S_i(t) + n(t), \tag{3}$$

where n(t) is a sample function from a white Gaussian noise process
of spectral density $N_0$; i.e.,

$$\langle n(t)n(t')\rangle = \frac{N_0}{2}\,\delta(t - t'). \tag{4}$$


On the basis of this observation we wish to decide with minimum
probability of error ($P_e$) which of the M waveforms was transmitted.
The optimum (minimum $P_e$) receiver is known to consist of M matched
filters which give

$$z_j = \frac{1}{E_s}\int_0^T x(t)S_j(t)\,dt = \rho_{ij} + x_j, \tag{5}$$

where

$$x_j = \frac{1}{E_s}\int_0^T n(t)S_j(t)\,dt \tag{6}$$

and the decision that the kth waveform was transmitted is made if
$z_k > z_j$ for all $j \ne k$; i.e., the decision is made on the basis of
the largest matched filter output.
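The decision rule can be illustrated with a discrete-time analogue in which each waveform is a vector of samples and the matched-filter integral becomes an inner product. This is a minimal sketch under that assumption; the signal set, the received vector, and the function names are illustrative and not taken from the paper.

```python
def correlate(x, s, energy):
    """Discrete analogue of (5): normalized inner product of x with s."""
    return sum(xi * si for xi, si in zip(x, s)) / energy

def decide(x, signals):
    """Return the index of the largest matched-filter output and all outputs."""
    energy = sum(v * v for v in signals[0])  # equienergy alphabet assumed
    outputs = [correlate(x, s, energy) for s in signals]
    best = max(range(len(signals)), key=outputs.__getitem__)
    return best, outputs

# M = 4 biorthogonal alphabet in n = 2 dimensions.
signals = [(1, 0), (0, 1), (-1, 0), (0, -1)]
# Signal 0 plus an illustrative noise sample (-0.2, 0.3).
received = (0.8, 0.3)
index, z = decide(received, signals)
print(index, z)
```

For the biorthogonal alphabet used here only two correlations are actually needed, since the outputs for a waveform and its negative differ only in sign.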

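From (6), the $x_j$ are zero-mean Gaussian variates with covariance

$$\langle x_jx_k\rangle = \frac{1}{E_s^2}\int_0^T\!\!\int_0^T dt\,dt'\,\langle n(t)n(t')\rangle\,S_j(t)S_k(t') = \frac{N_0}{2E_s}\,\rho_{jk}. \tag{7}$$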


* The only commonly employed M-ary system (known to the author) which does 
not satisfy this restriction is M-level amplitude modulation. 
The error probability of this system is given by*

$$P_e = 1 - \int_{\Omega_i} dx_1 \cdots dx_M\; p(x_1, \ldots, x_M), \tag{8}$$

where $p(x_1, \ldots, x_M)$ is the multivariate zero-mean Gaussian
distribution with covariance given by (7), and the region of integration
$\Omega_i$ is defined by the condition

$\Omega_i$ = region in which $1 + x_i > \rho_{ij} + x_j$ for all $j \ne i$.

Clearly $P_e$ is a function of M parameters: $E_s/N_0$, $\rho_{12}$,
$\rho_{13}$, ..., $\rho_{1M}$, the first of which is a signal-to-noise
ratio, the remainder of which describe the correlation properties of the
modulation system.

Landau and Slepian have proved the long-conjectured result that
$P_e$ is minimized for a given M (but n unrestricted) by the simplex
configuration in which the correlation matrix has the form†

$$\rho_{ij} = \begin{cases} 1, & i = j \\ -\dfrac{1}{M - 1}, & i \ne j \end{cases} \qquad \text{simplex.} \tag{9}$$

The rank of this matrix is n = M − 1 so that

$$\left(\frac{R}{W}\right)_{\text{simplex}} = \frac{2\log_2 M}{M - 1}. \tag{10}$$

For this case the expression for $P_e$ may be reduced to a single integral
and numerical results are readily obtained.

Weber has derived locally optimum configurations when
$M/2 < n \le M - 1$. For n = M/2 a local optimum is the biorthogonal
configuration in which the signal vectors are located along the coordinate
axes (+ and −) of the n-dimensional vector space such that

$$\rho_{ij} = \begin{cases} 1, & i = j \\ -1, & i = j - (-1)^j \\ 0, & \text{otherwise} \end{cases} \qquad \text{biorthogonal.} \tag{11}$$


* Actually this is the error probability assuming the ith signal is transmitted.
However, under the assumption of equal a priori transmission of all signals, and the
permutation property assumed for the correlation matrix, this probability is
independent of i and is equal to the system probability of error.

† The "local optimality" of the simplex configuration (viz., that $P_e$ has a local
minimum) had been proved previously by Balakrishnan.
The rank of this M × M matrix is M/2 so that

$$\left(\frac{R}{W}\right)_{\text{biorthogonal}} = \frac{4\log_2 M}{M}. \tag{12}$$

In this case $P_e$ may also be expressed as a single integral which is
readily evaluated by machine techniques. Although for a given value
of M, biorthogonal modulation requires slightly more energy-per-bit
to achieve a given $P_e$ than simplex,* it is noted that (for large M)
R/W for biorthogonal is essentially twice that of simplex. Furthermore,
for biorthogonal half of the waveforms are the negatives of the
remaining half; consequently, M/2 rather than M matched filters are
required. For these reasons biorthogonal is generally preferred to
simplex, and indeed has been employed for deep-space communications.

The disadvantage of both simplex and biorthogonal modulation is
that good power efficiency is associated with large values of M (as it
must be for any modulation system) which from (10) and (12) imply
small values of R/W. Weber's results indicate locally optimum systems
with R/W between simplex and biorthogonal, but these are then
also restricted to relatively small R/W.

Optimum systems (in the sense of minimum $P_e$) are not known for
n < M/2. However, bounds on the error probability of optimum systems
have been obtained and evaluated. (The upper bound is obtained by
random coding arguments, and the lower bound by sphere packing
arguments.) These bounds are extremely useful in assessing the
performance of specific systems; however, to do so involves explicit
evaluation of $P_e$ for the specific systems of interest. This is at
best a difficult numerical task. Furthermore, we may find in comparing
two systems that one is better if we are interested in P, } 10%, 
whereas the reverse is true when P, = 10-°. Also, in comparing sys- 
tems with different values of M it may be unrealistic to compare P.,, 
since $P_e$ is the word error probability, and the systems contain a
different number of bits per word. Comparison on the basis of bit error
probability involves a difficult conversion from word to bit error
probability which involves coding arguments separate from the
modulation system performance. For all of these reasons it is desirable
to find a simpler criterion than $P_e$ which permits at least a gross
comparison of modulation systems.

* If the comparison is made for a fixed R/W rather than M then biorthogonal
requires less energy per bit. The simple unqualified statement that simplex is
the optimum modulation system is misleading.
III. BOUNDS ON ERROR PROBABILITY


One approach to comparing modulation systems is to obtain lower
and upper bounds on the true error probability*

$$P_L \le P_e \le P_U, \tag{13}$$

and to say that system 1 is better than system 2 if $P_{U1} < P_{L2}$.

If two systems are close in performance, the above procedure may 
not enable us to determine which is better unless the bounds are close. 
On the other hand, close bounds may be difficult to evaluate and may 
not lead to a simple performance criterion. We adopt the viewpoint 
here that it is desirable to have bounds, which although quite loose, 
lead to a simple sufficient condition for determining when one system 
is better than another. 

Let

$$\rho = \max_{i \ne j} \rho_{ij} = \max_{j > 1} \rho_{1j}. \tag{14}$$

That is, $\rho$ is the largest non-diagonal entry of the correlation
matrix. It is readily established that†

$$\Phi\left(-\sqrt{\frac{E_s}{N_0}(1 - \rho)}\right) \le P_e \le (M - 1)\,\Phi\left(-\sqrt{\frac{E_s}{N_0}(1 - \rho)}\right), \tag{15}$$

where

$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy. \tag{16}$$

The lower bound is obtained by observing that $P_e$ for an M-ary system
can be no less than that of the binary system containing nearest
neighbor waveforms. The upper bound follows from

$$P_e \le \sum_{j=2}^{M} \Phi\left(-\sqrt{\frac{E_s}{N_0}(1 - \rho_{1j})}\right) \le (M - 1)\,\Phi\left(-\sqrt{\frac{E_s}{N_0}(1 - \rho)}\right), \tag{17}$$

where the first inequality in (17) is a consequence of the symmetry
property of the system and the fact that the probability of a union of
events is less than the sum of the probabilities of the events. The







* We consider here bounds on word error probability, which, however, may be
easily converted to bounds on bit error probability. For example, if a word is in
error at least one bit is in error, and at most all bits are in error. Hence,
$P_e/\log_2 M$ and $P_e$ are lower and upper bounds on the bit error probability.

† These bounds are generally well known and appear widely in the literature;
e.g., Refs. 7, 17, 18, 19. Also, as noted in the previous footnote, these bounds on
word error probability are readily converted into bounds on bit error probability.
second inequality in (17) follows simply by observing that a sum of
(M − 1) terms is no greater than (M − 1) times the largest term.

In comparing modulation systems with different alphabet size it is
more appropriate to consider the energy per bit E rather than the
signal energy $E_s$, where

$$E = E_s/\log_2 M. \tag{18}$$

Indeed, the parameter $E/N_0$ is an appropriate measure of the power
efficiency of a modulation system. The Shannon channel capacity
formula requires that $E/N_0 > \log_e 2$ to achieve arbitrarily small
$P_e$; conventional systems generally require values of $E/N_0$ at least
4 times the Shannon minimum.

In terms of the parameter $E/N_0$ the error probability bounds may
be rewritten

$$\Phi\left(-\sqrt{\frac{E}{N_0}K}\right) \le P_e \le (M - 1)\,\Phi\left(-\sqrt{\frac{E}{N_0}K}\right), \tag{19}$$

where

$$K = (1 - \rho)\log_2 M. \tag{20}$$

We will say that system 1 is "better" than system 2 if

$$(M_1 - 1)\,\Phi\left(-\sqrt{\frac{E}{N_0}K_1}\right) < \Phi\left(-\sqrt{\frac{E}{N_0}K_2}\right). \tag{21}$$

Several conclusions are apparent from (21).
(i) In the limit of large $E/N_0$, $K_1 > K_2$ is sufficient to ensure
that system 1 is better than system 2. (If $K_1 > K_2$ we will say that
system 1 is "asymptotically better" than system 2.)

(ii) If system 1 is asymptotically better than system 2, then there
exists a value of $E/N_0$ above which system 1 is better than system 2.
Below this value of $E/N_0$ our formalism is generally inadequate to
determine which system is better. (The critical value of $E/N_0$ may
be obtained by replacing the inequality in (21) by an equality.)

(iii) A binary system that is asymptotically better than an M-ary
system is always better than the M-ary system.


Thus, we can always determine quite simply which of two systems is
asymptotically better, and may, in many special cases, be able to make
comparisons at specific $E/N_0$ of interest.
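The bounds (19) and the sufficient condition (21) are straightforward to evaluate. The following is a minimal sketch assuming that $E/N_0$ is supplied as a linear ratio (not in dB) and that $\Phi$ is the standard Gaussian distribution function of (16), computed here from the complementary error function; the function names are illustrative.

```python
import math

def phi(x: float) -> float:
    """Gaussian distribution function of (16), via erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def k_factor(rho: float, m: int) -> float:
    """K = (1 - rho) log2 M, equation (20)."""
    return (1.0 - rho) * math.log2(m)

def pe_bounds(ebn0: float, k: float, m: int):
    """Lower and upper bounds on word error probability, equation (19)."""
    base = phi(-math.sqrt(ebn0 * k))
    return base, (m - 1) * base

def is_better(ebn0: float, m1: int, k1: float, k2: float) -> bool:
    """Sufficient condition (21): upper bound of system 1 below lower bound of system 2."""
    return (m1 - 1) * phi(-math.sqrt(ebn0 * k1)) < phi(-math.sqrt(ebn0 * k2))

# Binary antipodal signalling (rho = -1) against 8-phase PSK (rho = cos(2 pi / 8)).
k_binary = k_factor(-1.0, 2)
k_8psk = k_factor(math.cos(2.0 * math.pi / 8.0), 8)
print(is_better(1.0, 2, k_binary, k_8psk), pe_bounds(4.0, k_8psk, 8))
```

For a binary system 1 the factor $(M_1 - 1)$ is unity, so the condition reduces to $K_1 > K_2$ at every $E/N_0$, which is conclusion (iii). The condition is only sufficient: a `False` result does not show that system 2 is better.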

It should be emphasized that the above comparison is on the basis
of the $P_e$ obtained with the two systems when operated at the same
average power and information rate. To complete the comparison, the
bandwidth requirements of the two systems should also be considered.
Thus, the parameter R/W, as well as K, should be used in comparing
systems.

In the following sections of this paper specific systems will be
considered and represented by points on a K, R/W plot. This will enable
an immediate comparison of the asymptotic performance of systems
having the same R/W. It should be noted that Gilbert used a similar
plot in his 1952 paper which addressed the same subject considered
here. Gilbert employed a (SNR, R/W) plot in which the effective
signal-to-noise ratio (SNR) was obtained for a given $P_e$ by using
the upper bound in (19). Since the SNR is related to our 1/K, better
systems correspond to smaller SNR. Our purpose in writing this paper
is not to argue that our plot is a better way to present the results
than Gilbert's. (Indeed, since in general, $P_e$ is much closer to the
upper than to the lower bound, his method of comparison is somewhat
better, although somewhat less convenient to use.) Our purpose rather is
to resurrect these old methods which have been largely discarded
since the advent of high-speed computation, and to illustrate their
applicability to recently proposed modulation systems.


IV. PHASE, FREQUENCY, BIORTHOGONAL AND SIMPLEX MODULATIONS 


4.1 Phase-Shift Modulation 


For M phasors uniformly spaced on the unit circle, $\rho = \cos(2\pi/M)$.
Therefore,

$$K = 2\log_2 M\,\sin^2\frac{\pi}{M}. \tag{22}$$

Note that K = 2 for both M = 2 and M = 4* and falls off thereafter.
Since the dimensionality of the signal space is n = 1 for M = 2 and
n = 2 for M > 2, it follows that R/W is given by

$$\frac{R}{W} = \begin{cases} 2 & \text{for } M = 2, \\ \log_2 M & \text{for } M > 2. \end{cases} \tag{23}$$

* K is maximized (for integer M) when M = 3. In practice, it is generally
desirable to consider only those values of M which are integer powers of 2 (i.e.,
each symbol conveys an integer number of bits). We shall restrict our numerical
examples to such cases.
TABLE I — PHASE-SHIFT MODULATION

 M     K       R/W
 2     2       2
 4     2       2
 8     0.88    3
16     0.30    4
32     0.098   5
64     0.030   6

Table I lists the K and R/W values for phase-shift modulation, and
these are denoted by dots in Fig. 1. It is apparent that M = 2 and
M = 4 are asymptotically better than the higher-order systems, and
from our previous results this implies that the binary system is always
better than the general M-ary case with M > 4.* Recall that we are
consistently using the term "better" to mean smaller $P_e$ for a given
$E/N_0$. Large alphabet phase modulation may still be desirable because
of the larger R/W.

Fig. 1 — (K, R/W) plot for phase-shift, orthogonal, biorthogonal, and simplex
modulations. (Legend: PSK, dots; orthogonal, circles; biorthogonal, squares;
simplex, triangles.)
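Equations (22) and (23) can be evaluated directly to regenerate the entries of the phase-shift table. This is a small sketch; the function names are illustrative, and the printed values are rounded only for display.

```python
import math

def psk_k(m: int) -> float:
    """K = 2 log2(M) sin^2(pi / M), equation (22)."""
    return 2.0 * math.log2(m) * math.sin(math.pi / m) ** 2

def psk_rate(m: int) -> float:
    """R/W from (23): n = 1 for M = 2, n = 2 for M > 2."""
    return 2.0 if m == 2 else math.log2(m)

for m in (2, 4, 8, 16, 32, 64):
    print(m, round(psk_k(m), 3), psk_rate(m))

# The footnoted maximum over integer M can be checked the same way.
best_m = max(range(2, 65), key=psk_k)
print(best_m)
```

The rapid fall of K beyond M = 4 reflects the $\sin^2(\pi/M)$ factor shrinking roughly as $1/M^2$ while $\log_2 M$ grows only logarithmically.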


4.2 Frequency Shift (Orthogonal) Modulation

For M orthogonal signals (e.g., frequency-shifted signals with
essentially non-overlapping spectra), $\rho = 0$ and

$$K = \log_2 M. \tag{24}$$

The dimensionality of the signal space is the number of orthogonal
vectors, n = M, so that

$$\frac{R}{W} = \frac{2\log_2 M}{M}. \tag{25}$$

Table II lists the K and R/W values for orthogonal modulation, and
these are denoted by circles in Fig. 1. Larger values of M correspond to
systems which are asymptotically better, at the expense, however, of
smaller values of R/W. It is clear that binary orthogonal is inferior
to binary and quaternary PSK both in terms of a smaller K and
smaller R/W.†


TABLE II — ORTHOGONAL MODULATION

 M     K     R/W
 2     1     1
 4     2     1
 8     3     3/4
16     4     1/2
32     5     5/16
64     6     3/16


4.3 Biorthogonal Modulation

A biorthogonal system consists of M/2 orthogonal waveforms and
their negatives. The maximum correlation coefficient is $\rho = 0$ for
$M \ge 4$, but $\rho = -1$ for M = 2. Therefore,

$$K = \begin{cases} 2 & \text{for } M = 2, \\ \log_2 M & \text{for } M \ge 4 \ (M \text{ even}). \end{cases} \tag{26}$$


* This conclusion is confirmed by the exact calculations of $P_e$ for M-ary PSK
by C. R. Cahn.

† Binary FSK may still be employed, of course, for simplicity reasons or
because the channel phase coherence may not be consistent with phase-shift
modulation.
Since n = M/2,

$$\frac{R}{W} = \frac{4\log_2 M}{M}. \tag{27}$$

Table III lists the K and R/W values for biorthogonal modulation,
and these are denoted by squares in Fig. 1. Note that M = 2 and M = 4
biorthogonal are equivalent, respectively, to binary and quaternary
PSK.

Clearly, for fixed R/W, biorthogonal is asymptotically better than
orthogonal. For example, consider M = 4 orthogonal and M = 16
biorthogonal, both of which have R/W = 1. From (21), the biorthogonal
system is better than the orthogonal system for all $E/N_0 > 2.8$,
which corresponds to all P, of practical interest. (P, < 3(10)-*). 


4.4 Simplex Modulation 


In simplex modulation, the M code vectors form a regular simplex
in M − 1 dimensions. (All vectors are equally spaced from all other
vectors. This corresponds to an equilateral triangle in two dimensions,
and a regular tetrahedron in three dimensions.) All correlation
coefficients are equal and are given by $\rho = -1/(M - 1)$. Therefore,

$$K = \frac{M}{M - 1}\log_2 M. \tag{28}$$

Since n = M − 1,

$$\frac{R}{W} = \frac{2\log_2 M}{M - 1}. \tag{29}$$

Comparison of (28), (29) with (24), (25) indicates that for large M
simplex modulation is essentially identical to orthogonal modulation.
Table IV lists the K and R/W values for simplex modulation, and
these are denoted by triangles in Fig. 1. A quick glance at Fig. 1 indicates


TABLE III — BIORTHOGONAL MODULATION

 M     K     R/W
 2     2     2
 4     2     2
 8     3     3/2
16     4     1
32     5     5/8
64     6     3/8
that depending on the R/W of interest, biorthogonal or PSK modulation
offers the best asymptotic performance of the systems considered
so far. (The dashed line in Fig. 1 is drawn through these "best"
points.) Note that although simplex provides the largest K for a fixed
value of M, it does not do so for fixed R/W.*


TABLE IV — SIMPLEX MODULATION

 M     K       R/W
 2     2       2
 4     2.67    1.33
 8     3.43    0.86
16     4.26    0.53
32     5.16    0.32
64     6.10    0.19
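The entries of Tables II, III, and IV follow from (24) through (29). The sketch below evaluates them side by side so that systems with equal R/W can be compared; the function names are illustrative.

```python
import math

def orthogonal(m: int):
    """(K, R/W) from (24) and (25)."""
    return math.log2(m), 2.0 * math.log2(m) / m

def biorthogonal(m: int):
    """(K, R/W) from (26) and (27); M = 2 is the antipodal special case."""
    k = 2.0 if m == 2 else math.log2(m)
    return k, 4.0 * math.log2(m) / m

def simplex(m: int):
    """(K, R/W) from (28) and (29)."""
    return m / (m - 1) * math.log2(m), 2.0 * math.log2(m) / (m - 1)

for m in (2, 4, 8, 16, 32, 64):
    print(m, orthogonal(m), biorthogonal(m), simplex(m))
```

For example, M = 4 orthogonal and M = 16 biorthogonal share R/W = 1, and the biorthogonal system has the larger K, in line with the comparison made in Section 4.3.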


V. PERMUTATION MODULATION 


Slepian has recently described an exceedingly general modulation
system (permutation modulation) for which all of the systems
considered in the previous section are special cases. The optimum
demodulation algorithm is particularly simple, but the actual evaluation
of $P_e$, and the finding of good special cases, is somewhat more complex.
We restrict ourselves here to a special subclass of permutation
modulation. This subclass is suggested both as the simplest generalization
of biorthogonal systems, and because perusal of Slepian's results indicates
that systems taken from this subclass are amongst the better of the
moderate-sized alphabet examples which he considers.

Following Slepian we define an (n,m) permutation modulation sys- 
tem as follows. The time interval T is divided into n subintervals (n = 
2TW). The first waveform of the alphabet consists of a signal with 
amplitude unity in the first m subintervals (m < n), and zero ampli- 
tude in the remaining subintervals. The remainder of the waveforms 
consist of all possible permutations of the subintervals, allowing also 
all combinations of plus and minus amplitudes. For example, the 
(3,2) system contains twelve waveforms which we may represent as 


(1,1,0), (1,−1,0), (−1,1,0), (−1,−1,0),
(1,0,1), (1,0,−1), (−1,0,1), (−1,0,−1),
(0,1,1), (0,1,−1), (0,−1,1), (0,−1,−1).


* For the special case M = 2, simplex, biorthogonal and PSK are all 
equivalent. 
In general, it is easily seen that the alphabet size M is given by

$$M = 2^m\binom{n}{m}. \tag{30}$$

It is also noted that the special case (n,1) corresponds to biorthogonal
modulation.*

This (n,m) modulation clearly satisfies the symmetry requirements
of our theory. All members of the alphabet have equal energy† and the
correlation matrix has the desired permutation property. It is readily
seen that the maximum correlation coefficient is given by

$$\rho = \frac{m - 1}{m}. \tag{31}$$

Thus,

$$K = (\log_2 M)(1 - \rho) = 1 + \frac{1}{m}\log_2\binom{n}{m}, \tag{32}$$

so that (n,m) modulation always achieves K > 1. Also

$$\frac{R}{W} = \frac{2\log_2 M}{n} = \frac{2mK}{n}. \tag{33}$$

Equations (32) and (33) suggest that (n,m) modulation may achieve
both large values of K and large R/W, which was not possible with
any of the systems described in the previous section.
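
As a numerical cross-check, the maximum correlation coefficient can be found by exhaustive search and compared with (m - 1)/m, and K and R/W can then be evaluated from (32) and (33). This is only a sketch under the normalization used above (every word has energy m); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb, log2


def max_correlation(n, m):
    """Brute-force maximum correlation coefficient over distinct pairs of words."""
    words = []
    for positions in combinations(range(n), m):
        for signs in product((1, -1), repeat=m):
            word = [0] * n
            for position, sign in zip(positions, signs):
                word[position] = sign
            words.append(word)
    best = Fraction(-1)
    for u, v in combinations(words, 2):
        inner = sum(a * b for a, b in zip(u, v))
        best = max(best, Fraction(inner, m))         # every word has energy m
    return best


def k_and_rate(n, m):
    """Return (K, R/W) from equations (32) and (33)."""
    log_m = m + log2(comb(n, m))                     # log2 of M = 2^m * C(n, m)
    rho = Fraction(m - 1, m)                         # equation (31)
    k = log_m * float(1 - rho)
    return k, 2 * log_m / n


print(max_correlation(3, 2), k_and_rate(3, 2))
```

The exhaustive search is quadratic in M, so it is practical only for the moderate alphabet sizes discussed here.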


5.1 (n,2) Modulation 


Since m = 1 leads to biorthogonal modulation which has many de- 
sirable properties, it is natural to look next at the special case m = 2. 
From (32) and (33) it follows that for (n,2), 


K = \frac{1}{2} [1 + \log_2 n(n - 1)]    (n \ge 3)    (34)
and
\frac{R}{W} = \frac{4K}{n}.    (35)


* In Slepian’s terminology, the (n,m) modulation described here is a variant
II system in which m_1 = n - m, m_2 = m and μ_1 = 0, μ_2 = 1.

† With the normalization employed above, the signal energy is m. However,
all code words may be multiplied by a constant to achieve any desired E.
Values of K and R/W are given in Table V and are plotted as M’s
in Fig. 2. (For reference, Fig. 2 also contains the biorthogonal and
PSK results from Fig. 1.) Thus, similar to biorthogonal, as n becomes
large K increases but R/W decreases. It is seen from Fig. 2 that (n,2)
modulation gives better performance (larger K for a given R/W) than
biorthogonal or PSK.*
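
The (n,2) values can be regenerated directly. The sketch below assumes the m = 2 forms of (32) and (33), namely K = (1/2)[1 + log_2 n(n - 1)] and R/W = 4K/n, together with M = 2n(n - 1); the function name is illustrative. The computed values agree with Table V to within about one unit in the last printed digit, which is roughly what slide-rule accuracy would give.

```python
from math import log2


def n2_row(n):
    """Return (M, K, R/W) for (n, 2) modulation, n >= 3."""
    if n < 3:
        raise ValueError("(n, 2) modulation requires n >= 3")
    m_size = 2 * n * (n - 1)                  # alphabet size for m = 2
    k = 0.5 * (1 + log2(n * (n - 1)))         # K for m = 2
    rate = 4 * k / n                          # R/W for m = 2
    return m_size, k, rate


for n in range(3, 8):
    m_size, k, rate = n2_row(n)
    print(f"{n}  {m_size:3d}  {k:.2f}  {rate:.2f}")
```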


5.2 (2m,m) Modulation 


(2,1) corresponds to M = 4 biorthogonal, which from our earlier
results gives K = 2, R/W = 2. It is seen from Table V that (4,2)
gives K = 2.30, R/W = 2.30 which corresponds to both better
asymptotic performance and better bandwidth utilization.


TABLE V — (n,2) MODULATION

n    M = 2n(n - 1)    K       R/W

3    12               1.79    2.38
4    24               2.30    2.30
5    40               2.66    2.13
6    60               2.95    1.97
7    84               3.19    1.82


It is apparent from (33) that whenever n = 2m, R/W = K, and an
immediate question is how large can we make these two quantities.
With n = 2m, it follows from (32) that


K = 1 + \frac{1}{m} \log_2 \binom{2m}{m}.    (36)


Use of Stirling’s approximation when m ≫ 1 gives


\binom{2m}{m} \approx \frac{2^{2m}}{\sqrt{\pi m}}    (37)
so that for large m, K → 3. It is easily shown that K increases
monotonically towards this asymptotic value as m is increased. Thus,
(2m,m) modulation does not permit attainment of arbitrarily large
values of K, and hence cannot attain arbitrarily low P_e with finite
E/N_0. This is consistent with Slepian’s statement* that permutation
modulation cannot approach channel capacity arbitrarily closely at
non-zero R/W.
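
The approach of K to 3 can be illustrated numerically. The sketch below assumes K = 1 + (1/m) log_2 C(2m,m) for n = 2m and compares it with the Stirling estimate 3 - log_2(πm)/(2m), which follows from C(2m,m) ≈ 2^{2m}/sqrt(πm); both function names are hypothetical.

```python
from math import comb, log2, pi


def k_exact(m):
    """K for (2m, m) modulation, evaluated with an exact binomial coefficient."""
    return 1 + log2(comb(2 * m, m)) / m


def k_stirling(m):
    """Stirling estimate of the same quantity, intended for m >> 1."""
    return 3 - log2(pi * m) / (2 * m)


for m in (1, 2, 5, 20, 100):
    print(m, round(k_exact(m), 4), round(k_stirling(m), 4))
```

The deficit 3 - K shrinks only as (log m)/m, so quite large m is needed before K comes close to its limit.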





* This is, of course, achieved only at the expense of a larger alphabet size. It
may also be noted from Table V that the alphabet size is not generally a power
of 2, which may also be a practical disadvantage.
Fig. 2— (K,R/W) plot for permutation modulation. (Legend: PSK;
biorthogonal; solid curve, (km,m) with m ≫ 1 and (k - 1)m ≫ 1.)


5.3 (km,m) Modulation (k > 1) 


As an immediate generalization of the above, consider the more
general case n = km where k > 1.* Then, from (33)


\frac{R}{W} = \frac{2}{k} K    (38)

and from (32)

K = 1 + \frac{1}{m} \log_2 \binom{km}{m}.    (39)


Again using Stirling’s approximation for large m (assuming also that
(k - 1)m ≫ 1),


\binom{km}{m} \approx \sqrt{\frac{k}{2\pi m (k - 1)}} \left[ \frac{k^k}{(k - 1)^{k-1}} \right]^m.    (40)


* Of course k should be chosen so that km is an integer. 
Thus, in the limit of large m, for fixed k > 1,


K \to 1 + k \log_2 \left( \frac{k}{k - 1} \right) + \log_2 (k - 1).    (41)


For large k, the right-hand side of (41) increases as log k; however,
as seen from (38) R/W diminishes as k^{-1}. The locus of K, R/W values
obtained with different values of k (but large m so that the
approximation (41) applies) is shown by the solid curve in Fig. 2. In the
limit as k → 1 (but m always sufficiently large such that (k - 1)m ≫ 1),
K → 1 and R/W → 2. As k increases, both R/W and K increase until
k ≈ 1.5 at which point R/W ≈ 3.2 and K ≈ 2.4. Further increases in
k result in a reduction in R/W but continued increase in K.
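
The locus can be traced numerically. The sketch below assumes the limiting form K → 1 + k log_2(k/(k - 1)) + log_2(k - 1) of (41) together with R/W = 2K/k, and scans a grid of k values for the largest R/W; the grid spacing and the function names are arbitrary choices made for illustration.

```python
from math import log2


def k_limit(k):
    """Large-m limit of K for n = km; requires k > 1."""
    return 1 + k * log2(k / (k - 1)) + log2(k - 1)


def rate(k):
    """R/W along the locus, using R/W = 2K/k."""
    return 2 * k_limit(k) / k


# Scan k on a grid and locate the largest R/W along the locus.
grid = [1 + i / 100 for i in range(1, 400)]
best_k = max(grid, key=rate)
print(best_k, round(rate(best_k), 2), round(k_limit(best_k), 2))
```

Differentiating R/W with respect to k gives a quantity proportional to -log 2(k - 1), which vanishes at k = 1.5, so under these assumptions the maximum is R/W = 2 log_2 3 with K = 1.5 log_2 3.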

The above results indicate that (n,m) codes can be found with
R/W as large as octary PSK (R/W = 3) and with a better
asymptotic performance.


VI. COMBINED PHASE-SHIFT AND ORTHOGONAL MODULATION 


In the previous examples we have compared by approximate meth- 
ods modulation systems which have already been analyzed exactly. 
Although perhaps additional insight into the relative performance of 
these systems has been obtained, many of our conclusions may be 
inferred from existing exact calculations. We now wish to consider a 
new system, recently proposed by Reed and Scholtz,⁵,⁶ which (to our
knowledge) has not yet been evaluated numerically. 

Consider an alphabet M divided into M_1 groups, each group
containing M_2 members. Thus,


M = M_1 M_2.    (42)


The different groups may be considered to be sufficiently separated in
frequency so that waveforms from different groups are orthogonal.
Within a group the waveforms have the correlation properties
associated with phase-shift modulation. Thus for M_2 ≥ 4 the maximum
correlation coefficient is ρ = cos (2π/M_2), and


K = 2 \sin^2 \left( \frac{\pi}{M_2} \right) (\log_2 M_1 + \log_2 M_2).    (43)

Since each group requires a two-dimensional sub-space


n = 2M_1.    (44)
Thus,


\frac{R}{W} = \frac{\log_2 M_1}{M_1} + \frac{\log_2 M_2}{M_1}.    (45)

In the special case of M_1 = 1 it is apparent that this system reduces to
simple phase-shift modulation (Section 4.1). In the special case of
M_2 = 4 it reduces to the biorthogonal case (Section 4.3). A question
of interest then is whether choices of M_2 > 4, M_1 > 1 lead to better
performance than either phase-shift or biorthogonal modulation.*


Fig. 3—(K,R/W) plot for combined phase-shift and orthogonal modulation.


In Fig. 3 the K and R/W values (obtained from (43) and (45)) are
shown for the combined phase orthogonal modulation. The solid curves
are for constant values of M_1 (noted on the curve); the uppermost
point on each such curve corresponds to M_2 = 4, and each lower point
corresponds to M_2 increased by a factor of two. The dashed curve
goes through the M_2 = 4 points (biorthogonal). It is apparent from
this figure that in this class of systems, for R/W ≤ 2, the M_2 = 4
biorthogonal systems give the largest value of K. For R/W ≥ 2, the
M_1 = 1 phase-shift systems give the largest value of K. Thus, in terms

* Reed and Scholtz⁵,⁶ are concerned largely with an algebraic method of
generating waveforms with the above correlation properties, rather than with a
comparative evaluation of performance.
of asymptotic performance, choice of M_2 > 4, M_1 > 1 always gives
poorer performance than systems which achieve the same R/W with
either M_2 = 4 (biorthogonal) or M_1 = 1 (simple phase shift).

For example, consider M_1 = 2, M_2 = 8. This yields R/W = 2 and
K = 1.17. However, R/W = 2 is also achieved with M_1 = 1, M_2 = 4
(quaternary PSK), and for this case K = 2. From (21) we can
conclude that the latter system is better than the former for all E/N_0 >
2.5, which includes all P_e of interest. The significance of these results
is that we can make this comparison with only a simple slide-rule
calculation.
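
The slide-rule comparison above is easily repeated in code. This sketch assumes M_1 orthogonal groups of M_2 phases each, with ρ = cos(2π/M_2) for M_2 ≥ 4, K = (1 - ρ) log_2 M and n = 2M_1; the function name is illustrative.

```python
from math import cos, log2, pi


def combined_point(m1, m2):
    """Return (K, R/W) for m1 orthogonal groups of m2 phases each (m2 >= 4)."""
    if m1 < 1 or m2 < 4:
        raise ValueError("this sketch covers m1 >= 1 and m2 >= 4 only")
    rho = cos(2 * pi / m2)                    # maximum correlation coefficient
    log_m = log2(m1) + log2(m2)               # log2 of M = m1 * m2
    k = (1 - rho) * log_m                     # same as 2 sin^2(pi/m2) * log2 M
    rate = log_m / m1                         # R/W = 2 log2(M) / n with n = 2 * m1
    return k, rate


for m1, m2 in [(2, 8), (1, 4)]:
    k, rate = combined_point(m1, m2)
    print(m1, m2, round(k, 2), round(rate, 2))
```

Both parameter choices give R/W = 2, so the difference in K is the whole of the comparison.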

In the above comparison we considered only M_2 ≥ 4. The case of
M_2 = 1, M_1 > 1 is the orthogonal modulation previously considered.
The case of M_2 = 2, M_1 > 1 gives the same performance as
biorthogonal but achieves only ½ the R/W and consequently is of little
interest. The case of M_2 = 3, M_1 > 1 consists of orthogonal
combinations of two-dimensional simplexes (equilateral triangles). Reed and
Scholtz⁶ conjecture that for M = 3M_1, the three-phase orthogonal
system gives a smaller P_e than any other collection of 3M_1 signal
functions in a space of dimensionality 2M_1. Although this conjecture may
well be true, we wish to point out that if the comparison is made on
the basis of fixed R/W (rather than fixed M) then biorthogonal is
asymptotically better than three-phase orthogonal. One way of seeing
this is by noting that three-phase orthogonal has the same K but
smaller R/W than the four-phase (biorthogonal) system of the same
dimensionality. To increase the R/W of the three-phase system
requires a reduction in K which makes it asymptotically poorer than
the corresponding biorthogonal system.


VII. BOUNDS ON K 


It has been shown that the (K,R/W) plot provides a useful tech- 
nique for comparing the performance of various modulation systems. 
Although our main concern here is in the comparison of specific sys- 
tems, it is still natural to ask whether there are bounds on what may 
be achieved in the (K,R/W) plane. 

It is apparent from the definition of 


K = (1 - \rho) \log_2 M    (46)


that if no constraint is placed on alphabet size or signal space 
dimensionality, K can, in principle, be made arbitrarily large for any 
R/W. This corresponds to the fact that the Shannon channel capacity
formula implies that arbitrarily small P_e may be achieved at all
(finite) R/W with finite E/N_0.

If M is held fixed but n is unconstrained, then the maximum K is
achieved by the simplex modulation’? (Section 4.4) for which case
K = [M/(M - 1)] log_2 M and R/W = (2 log_2 M)/(M - 1).
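
For a quick feel for these simplex values, the following sketch evaluates them for a given M. It assumes the standard simplex properties ρ = -1/(M - 1) and n = M - 1, which reproduce the two expressions above; the function name is hypothetical.

```python
from math import log2


def simplex_point(m_size):
    """Return (K, R/W) for simplex modulation with an alphabet of m_size words."""
    if m_size < 2:
        raise ValueError("a simplex needs at least two signals")
    rho = -1 / (m_size - 1)                   # all pairs equally anticorrelated
    k = (1 - rho) * log2(m_size)              # K = [M/(M - 1)] log2 M
    rate = 2 * log2(m_size) / (m_size - 1)    # R/W = 2 log2(M) / n with n = M - 1
    return k, rate


for m_size in (2, 4, 8, 16):
    k, rate = simplex_point(m_size)
    print(m_size, round(k, 3), round(rate, 3))
```

For M = 2 this gives K = 2 and R/W = 2, the same point as binary PSK, in line with the earlier remark that simplex, biorthogonal and PSK coincide in that case.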

Perhaps of more practical interest is the opposite case where the 
signal space dimensionality n is fixed, but M is unconstrained. Here, 
sphere-packing arguments may be used to show that? 


M$ Tonal s+, 4), (47) 
where I_x(p,q) is the incomplete beta-function which is extensively
tabulated.” Thus, for a given ρ and n, an upper bound to M may be
calculated from (47). Since I_x(p,q) is monotonic increasing in x, this
also gives a lower bound on ρ for fixed M and n. Considered in this
latter context we can then determine an upper bound on K with which
is associated a given value of R/W = (2/n) log_2 M. This upper bound,
K_u, is plotted in Fig. 4 as a function of R/W for n = 5 and n = 10.
Both curves indicate that K_u achieves a maximum value. This is
understandable since for large R/W, 1 - ρ decreases more rapidly than
log_2 M increases. On the other hand, as R/W decreases, log_2 M keeps
decreasing, whereas 1 - ρ is of course always less than 2. Thus, it is
not surprising that there exists an R/W at which K_u is a maximum.

It should be noted, however, that K_u is an upper bound which likely
cannot be achieved. For example, when R/W = (2/n) log_2 2n,
corresponding to M = 2n, the optimum configuration is widely
conjectured to be the biorthogonal case.2* The corresponding K and R/W
values for biorthogonal with n = 10 and n = 5 are shown by the
points marked (10,1) and (5,1) on the dashed curves of Fig. 4. These
points lie well below the upper bounds.

Biorthogonal is a special case (m = 1) of the (n,m) permutation 
modulation considered in Section V. Fig. 4 (dashed curves) shows the 
K and R/W values for the (10,m) and (5,m) cases. As must be,
these curves lie below the upper bounds given by the solid curves. 

Finally, we note from Fig. 4 that (n,m) permutation modulation 
possesses the interesting feature that as m is increased (for a fixed n) 
a maximum R/W is achieved. Both the properties of the maxima of
K_u and the maxima of the R/W of (n,m) modulation are probably
worthy of further study. 
Fig. 4— Bounds on (K,R/W) for fixed n and comparison with permutation
modulation.


VIII. CONCLUSION 


The main conclusion to be drawn is that the K, R/W plot provides 
an exceedingly useful technique for comparing modulation systems. 
We have restricted ourselves to modulation systems in which the 
signal alphabet consists of equienergy waveforms for which all rows 
of the correlation matrix are permutations of a given row. (Geo- 
metrically, the alphabet consists of M points on the surface of an n- 
dimensional sphere such that all points see exactly the same environ- 
ment.) This class of systems, although somewhat limited, is sufficiently 
broad to cover most systems of theoretical and practical interest. 
Given two systems in this class such that K_1 > K_2, then in the limit
of large E/N_0 (low P_e) P_e1 < P_e2 for the same E/N_0. Furthermore,
we have obtained a simple sufficient condition on the E/N_0 above
which this inequality is valid. These results are in reality not new.
They are implicit in the results of Shannon’ and in many other works.*®
What is perhaps new is that many interesting results and comparisons 
can be obtained by such simple techniques. 

Considerably more precise comparisons can of course be made by 
exact computation of P_e rather than by comparison of K. The latter
procedure however is considerably quicker and allows ready con- 
sideration of entire classes of systems (e.g., the (n,m) permutation 
modulation and the combined phase-shift orthogonal modulations 
considered in the previous sections). The comparisons discussed here 
are not meant to supplant exact evaluation, but rather as a coarse sieve 
for delineating systems worthy of more extensive calculation. 
Combinatorial Solution to the Problem of
Optimal Routing in Progressive Gradings 


By V. E. BENES 
(Manuscript received June 10, 1966) 


The grading or graded multiple proposed by E. A. Gray is a certain
kind of one-stage, two-sided, partial access telephone connecting network 
for switching customers’ lines to trunks all having the same destination. 
Its essential feature is that traffic from lines not having identical access 
patterns can be offered to a common trunk, and so pooled. In a progressive 
grading the trunk groups are partially ordered in a hierarchy, i.e., some
provide primary routes, others function as secondary routes which handle 
traffic overflowing from primary routes, as well as originating traffic, etc., 
up to final routes. 

A call which is using an overflow or “later” trunk when it could be using
a primary or “earlier” group is said to make a “hole in the multiple”.
It was recognized early in the development of gradings that such holes were 
undesirable. 

The problem of optimal routing in telephone networks, considered in 
general in the author’s earlier work, is here specialized to progressive
gradings. It had been shown that for networks with certain combinatorial
properties the optimal choices of routes for accepted calls (so as to minimize 
the loss under perfect information) could be described in a simple and in- 
tuitive way in terms of these properties. The present paper gives a proof 
that all progressive gradings have such a combinatorial property, associated 
with the hierarchical nature of the grading. The optimal policy for routing 
accepted calls is related to the phenomenon of “holes in the multiple”,
and can be paraphrased in the traditional telephone terminology thus: 
filling a hole in the multiple is preferable to using a final route, and filling 
an earlier hole is preferable to filling a later one. 


I. INTRODUCTION


The term ‘hierarchical’ has often been used to describe connecting 
networks in which the possible routes for a call are ordered, with the 
order determining the routing decisions in that the earlier routes are
hunted over before the later. The Bell System’s toll network is often 
cited as an example of a hierarchical network. Recently, J. H. Weber 
has used the word ‘hierarchical’ in a more technical sense to describe 
trunking networks “. .. in which at least some of the trunk groups are 
high usage; i.e., traffic which is not carried can be overflowed to other 
groups, at least some of which are finals, which have no alternate 
route.”

In this paper, we consider some ways in which the concept of a 
hierarchy of routes is relevant to the problem of optimal routing as 
formulated in previous work.² Naturally, such a hierarchy can be
relevant to routing only if it is in a suitable way related to those 
combinatorial properties of the network which distinguish the ‘good’ 
from the ‘poor’ ways of completing calls. (Examples of such properties 
were given in Ref. 2.) It shall be shown that natural hierarchies as- 
sociated with certain gradings hold the key to the routing problem in 
these one-stage networks. 

It is now known² that if a network possesses one of certain
combinatorial properties, then this property can be used to describe in a
simple way the optimal choices of routes for accepted calls so as to 
minimize the loss under perfect information. The next natural ques- 
tion is, then, what networks possess some of these properties? We 
shall prove that the members of an important subclass of connecting 
networks, that of progressive gradings, all have a combinatorial prop- 
erty similar to the strongest of those of Ref. 2; this property is as- 
sociated with a natural hierarchy of routes, and leads to a solution of 
the routing problem for accepted calls. 


II. GRADINGS


We first discuss and clarify some of the usage and terminology as- 
sociated with gradings. Since about* 1905 the noun ‘grading’ and the 
adjective ‘graded’ have been used in telephony to describe a certain 
kind of one-stage two-sided network for connecting customers’ lines 
to trunks all having the same destination. Roughly speaking, a grading 
has this property: some trunk is such that two lines have access to 
it which do not have access to the same trunks. The essential feature 
is that traffic from distinguishable lines (i.e., ones not having identical 
access patterns) can be offered to a common trunk. 


* E. A. Gray proposed the “graded multiple” in 1905, and was granted a
patent for it (No. 1002388) in 1911.
It appears, though, that the word ‘grading’ has been used in a wider
sense in Europe than in the United States. In particular, the American 
usage® implies a certain order in the pattern of access that the lines 
have to the trunks, whereas in the European meaning this implication 
is absent. The order implicit in the American usage amounts to this: 
the trunks are partitioned into groups which are so partially ordered 
that no group has more than one successor in the ordering; a line that 
has access to one group has access to all groups that follow it in the 
ordering. (This ordering usually determines the order in which the 
lines hunt over the trunks.) Thus, e.g., a trunk group with no predeces- 
sors in the ordering can be used by exactly one group of lines, for 
which it is the “primary” route. In one European sense of “grading,” 
however, a trunk group which is the first one hunted over by one line 
group may be the nth one (n > 1) hunted over by some other line 
group.* The distinction drawn here is of some importance, inasmuch 
as the order structure implicit in the American usage gives rise to a 
natural hierarchy of routes that is directly relevant to routing, whereas 
in the more general case this hierarchy is not necessarily present. 

Recently, in an effort to establish a uniform terminology, the 
nomenclature committee of the International Teletraffic Congress
decided® that the terms ‘grading’ and ‘graded multiple’ should be
interchangeable, and the structures described in R. I. Wilkinson’s paper³
as graded multiples be called, more specifically, progressive graded
multiples or progressive gradings, the word ‘progressive’ here
referring to the order structure we have described as characteristic of
the American usage. The usage recommended by this committee is 
adopted herein. 

Since the present work can be viewed as a continuation of Ref. 2, 
we take the liberty of assuming familiarity with the notations and 
concepts used there, and we include only occasional reminders of the 
meanings of important notions. 


III. HIERARCHIES OF ROUTES


It will be convenient to have a notation for routes. A route r for a
call c is just a way in which c can be put up or realized in a network ν,
and so it can be identified with the state in which the only call in
progress is c using route r. Thus, a route for c is any element of γ^{-1}(c).*
We use the variables q and r (over the set L_1 of states with one call in
progress) to denote routes.


* We recall that if x is a state, γ(x) is the assignment of inlets to outlets realized by x.
By a hierarchy of routes we mean a partial ordering ⊒ contained in
∼ ∩ (L_1 × L_1).


It is apparent that ⊒ can hold only between alternative routes for
the same call. (Of course, not every hierarchy of routes is relevant to 
routing; only those that have a suitable relation to the ways in which 
calls in progress block new calls will be of interest. The problem is 
to clarify the meaning of ‘suitable’.) 

A hierarchy of routes, being a partial ordering of the states with one 
call in progress, can be extended to, or can induce, a partial ordering 
of the whole set S of states in several natural ways. Since ⊒ can hold
only between alternative routes for the same call, it is reasonable to 
confine attention to extensions which hold only between states that are 
equivalent in the sense of ~ in Ref. 2, i.e., are (possibly) different ways 
of realizing the same assignment. An obvious first candidate for such 
an extension is given by the condition 


x ∼ y and r ≤ x, q ≤ y, r ∼ q imply r ⊒ q.    (1)
However, we eschew this definition in favor of a stronger one: let us set


x ⊒ y ≡ x is reachable from y by sequentially moving calls in
        progress from routes that are lower (later) (in the sense of
        ⊒ on L_1) to routes that are higher (earlier).*


It is intended here not merely that, as in (1), each call have a higher 
route in x than in y, but that it should be possible to pass from y to x by
a sequence of equivalent states each differing from the previous one 
in that one call has been rerouted on a higher route. This stronger 
condition is rendered formally by first defining 


x Q y ≡ |x ∩ y| = |x| - 1 and either
        x - (x ∩ y) ⊒ y - (x ∩ y), or
        |x| = 1 and x ⊒ y
and then setting


⊒ = I ∪ Q ∪ Q² ∪ ···    (2)


  = transitive closure of Q.
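
For a finite set of states the construction in (2) can be carried out mechanically. The sketch below assumes that the one-step relation Q is supplied as a set of ordered pairs and that the identity I is to be included, as the union in (2) suggests; the function name and the three-state example are purely illustrative.

```python
def reflexive_transitive_closure(elements, one_step):
    """Return I ∪ Q ∪ Q² ∪ ... for a relation Q given as a set of ordered pairs."""
    closure = {(x, x) for x in elements} | set(one_step)
    changed = True
    while changed:                                   # repeat until no new pair appears
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))              # compose (a, b) with (b, d)
                    changed = True
    return closure


states = {"x", "y", "z"}
q_pairs = {("x", "y"), ("y", "z")}                   # x Q y and y Q z
order = reflexive_transitive_closure(states, q_pairs)
print(sorted(order))
```

Here the pair ("x", "z") appears in the closure although it is not in Q itself, which mirrors the passage from y to x through a sequence of single reroutings.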


* In an attitude prejudiced and justified by the principal results (Theorems 1 and 
2) we are working toward, we use the words ‘lower’, ‘earlier’, and their antonyms so 
as to suggest consistently that lower routes are less desirable than higher, earlier ones 
are preferable to later, etc. 
IV. PROGRESSIVE GRADINGS


In a one-stage connecting network ν = (G, I, Ω, S), with I the set of
customers’ lines (inlets) and Ω that of trunks (outlets), the graph G
giving network structure is determined entirely by the access relation
A such that


l A t ≡ line l has access to trunk t.


The set S of states of ν can be represented by the set of all subsets of
A which are one-to-one correspondences. The range of x, rng (x), is
the set of trunks which are busy in x.

The access relation A can be used to give a simple definition of a 
progressive grading. We use X X Y for the Cartesian product of X 
and Y, i.e., the set of pairs (x,y) with x ∈ X and y ∈ Y. If X is a set,
| X | denotes the number of elements of X. 


Definition: ν is a progressive grading if and only if it is a one-stage
network for which there exist partitions Π and Ξ of Ω and I, respectively,
and a partial ordering ≤ of Π, such that for T, U, V ∈ Π and L ∈ Ξ


(i) (L × T) ∩ A ≠ ∅ implies (L × T) ⊆ A,
(ii) (L × U) ⊆ A, V ≥ U imply (L × V) ⊆ A,
(iii) U ≥ T, V ≥ T imply U ≤ V or V ≤ U,
(iv) |L| ≥ |∪ {T : (L × T) ⊆ A}|.

The first condition simply says that if a line has access to some trunk
from a group T, then all lines in its line group have access to every trunk
in T. The second condition says (roughly) that a line with access to
a trunk group T has access to all groups that are later than T in the
partial ordering. The third condition says that a trunk group is followed
(in the partial ordering) by at most one other group; if the “later”
groups are thought of as overflow groups, this means that each group
has at most one group to which to overflow traffic. Finally, the fourth
condition rules out the relatively uninteresting cases in which some line
group has access to more trunks in toto than there are lines in the group.
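
The four conditions, as paraphrased in the preceding paragraph, can be tested mechanically on a small network. The sketch below is an illustration only: the dictionary-based representation, the function name and the example grading are all assumptions, the supplied ordering is taken to be a genuine partial ordering without being checked, and the fourth condition is read as requiring each line group to have at least as many lines as trunks it can reach.

```python
from itertools import product


def is_progressive_grading(line_groups, trunk_groups, earlier_or_equal, access):
    """Check conditions (i)-(iv) for a finite one-stage network.

    line_groups, trunk_groups: dicts mapping a group name to a set of terminals.
    earlier_or_equal: set of pairs (U, V) of trunk-group names meaning U <= V.
    access: set of (line, trunk) pairs, i.e. the access relation A.
    """
    def pairs(l_name, t_name):
        return [(l, t) for l in line_groups[l_name] for t in trunk_groups[t_name]]

    names = list(trunk_groups)
    for l_name in line_groups:
        reachable = set()
        for t_name in names:
            hits = [p in access for p in pairs(l_name, t_name)]
            if any(hits) and not all(hits):
                return False                          # (i) access is all-or-nothing
            if all(hits):
                reachable |= trunk_groups[t_name]
                for v_name in names:                  # (ii) later groups are reachable
                    later = (t_name, v_name) in earlier_or_equal
                    if later and not all(p in access for p in pairs(l_name, v_name)):
                        return False
        if len(line_groups[l_name]) < len(reachable):
            return False                              # (iv) no more trunks than lines
    for t, u, v in product(names, repeat=3):          # (iii) successors form a chain
        if (t, u) in earlier_or_equal and (t, v) in earlier_or_equal:
            if (u, v) not in earlier_or_equal and (v, u) not in earlier_or_equal:
                return False
    return True


# Two primary groups P1, P2 overflowing to a common final group F.
line_groups = {"L1": {"a", "b"}, "L2": {"c", "d"}}
trunk_groups = {"P1": {"t1"}, "P2": {"t2"}, "F": {"t3"}}
order = {("P1", "P1"), ("P2", "P2"), ("F", "F"), ("P1", "F"), ("P2", "F")}
access = {(l, t) for l in "ab" for t in ("t1", "t3")}
access |= {(l, t) for l in "cd" for t in ("t2", "t3")}
print(is_progressive_grading(line_groups, trunk_groups, order, access))
```

Removing the access of L1 to the final group F breaks the second condition, so the check then fails.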

It is apparent that if a trunk group T_2 is later than one T_1, then every
line with access to T_1 has access to T_2. This is the “progressive”
property. In analogy with the intuitions expressed in Ref. 2, it should be
better to use an earlier trunk group than a later one, if both are available.
Thus, the structure of a progressive grading at once suggests the
conjecture that optimal routing will consist of using the early routes in
preference to the later or (to anticipate a bit) overflow groups. This
conjecture is true and follows from Theorem 2. In traditional telephone
terminology (see E. C. Molina’s appendix in Ref. 3) it states that filling
a hole in the multiple is preferable to using a final route, and that
filling an earlier hole is preferable to filling a later one.

A line group L is said to be a bye if it has access only to “overflow”
trunk groups, i.e., if


inf {T : L A T}


is not minimal in ≤, where we have written L A T for (L × T) ⊆ A.

It is easily seen that in a progressive grading a hierarchy of routes
can be defined by this rule: r ⊒ q if and only if r ∼ q and g(q) ≥ g(r),
where g(r) is the trunk group used by route r.* This is the natural
hierarchy of routes associated with a progressive grading; here r ⊒ q
if and only if r ∼ q and r is on an “earlier” trunk group than q. In this
instance, ≤ is also a simple ordering on each g(γ^{-1}(γ(r))). These simple
orderings forming the hierarchy of course correspond exactly to the
preference relation among routes suggested by the natural intuition
(already mentioned) that there is no point in using a later or “overflow”
trunk when an earlier one is available, because possibly fewer lines have
access to the latter. The relation ⊒ defined above on L_1 extends by
(2) to all of S.


V. PARTIAL ORDERING OF PROGRESSIVE GRADINGS 


In a proof to be given later we shall use the fact that the set of
progressive gradings can be partially ordered by a relation ⊒ according
to the following definition of covering: ν_1 covers ν_2 if and only if ν_2 is
obtained from ν_1 by removing, for some line group L, either (case 1) a
trunk from the first (in ≤_1) trunk group to which L has access together
with one line of L if L has access to more than one trunk, or (case 2)
the trunk to which L has access together with L itself if L has access
to exactly one trunk. That is, if ν_1 is defined by partitions Π_1, Ξ_1, a
partial ordering ≤_1 of Π_1, and an access relation A_1, then ν_1 covers ν_2
provided that there exist t ∈ T ∈ Π_1 and l ∈ L ∈ Ξ_1 with


T = inf {U ∈ Π_1 : (L × U) ⊆ A_1}

such that ν_2 is defined by (case 1)


* Note the shift to the converse. 
Π_2 = Π_1 - {T} + {T - {t}}

Ξ_2 = Ξ_1 - {L} + {L - {l}}

≤_2 = ≤_1 with T - {t} for T throughout

A_2 = A_1 - (I × {t}) - ({l} × Ω),

if T ≠ {t} or A_1 ∩ (L × Ω) ⊄ (L × {t}), and by (case 2)

Π_2 = Π_1 - {T}
Ξ_2 = Ξ_1 - {L}
≤_2 = ≤_1 - (Π_1 × {T})
A_2 = A_1 - (I × {t}) - (L × Ω),


if T = {t} and A_1 ∩ (L × Ω) ⊆ (L × {t}).

For practical purposes a network in which some line group has access 
to no trunks is in all respects equivalent to the same network with those 
lines omitted. For this reason the definition of covering was divided into 
cases 1 and 2, so as to build this equivalence right into the definition. 

As we have said, ν_1 covers ν_2 if and only if ν_2 results from ν_1 by ripping
out (i) some trunk from a “primary” group, (ii) a line with access to
it, and (iii) all crosspoints associated with these terminals, with the
proviso that if this leaves some lines with access to no trunks, then these
lines are also to be removed. Because of this, there exists a natural or
canonical map μ of the states S(ν_1) of ν_1 into those S(ν_2) of ν_2, defined
roughly by the condition that μx is what is left of x after the line and
trunk that define the covering of ν_2 by ν_1 have been ripped out. The
canonical map can be defined formally very simply, as follows: A state
x of ν_1 is representable as a subset of A_1 which is also a one-to-one
correspondence; similarly, a state of ν_2 is just a one-to-one map contained
in A_2; what is left of x after the ripping-out process is just


μx = x ∩ A_2.
Thus, if μ corresponds to ripping out line l and trunk t, and x = {(l,t)},
then μx = ∅ = zero state. If x = {(l,t_1)} or x = {(l_1,t)} with l_1 ≠ l
and t_1 ≠ t, then again μx = zero state. If x = {(l_1,t_1)} ∪ y, with l_1 = l
or t_1 = t, then μx = μy. It is easy to see that if μ rips out l and t, then
μS is isomorphic with the “cone”

{x ∈ S : x ≥ {(l,t)}},


because it does not make any difference whether l and t are present in
the system and connected to each other, or are just absent. That is,
μS is essentially the set of states of S that remain available if l is
connected to t with a holding-time of +∞.
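
The ripping-out process is simple to express when a state is stored as a set of (line, trunk) pairs. The following sketch covers only the removal of a single line and a single trunk (case 1 of the covering definition); the function name and the terminal labels are hypothetical.

```python
def canonical_map(state, line, trunk):
    """Return what is left of `state` after `line` and `trunk` are ripped out."""
    return frozenset((l, t) for (l, t) in state if l != line and t != trunk)


# The examples from the text, with l = "l" and t = "t".
print(canonical_map({("l", "t")}, "l", "t"))         # the zero state
print(canonical_map({("l", "t1")}, "l", "t"))        # again the zero state
y = {("l2", "t2")}
x = {("l", "t1")} | y
print(canonical_map(x, "l", "t") == canonical_map(y, "l", "t"))
```

Any call that uses the removed line or the removed trunk is simply dropped, which is the same as intersecting the state with the reduced access relation.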
This notion of a canonical map provides many useful notations.
It is convenient to extend the μ-notation as follows: For T ∈ Π


μT = T ∩ range (A_2) = T - {t}   if t ∈ T and T ≠ {t},
                       T         if t ∉ T,
                       ∅         if T = {t}.


Clearly, μT is what is left of the trunk group T after the line l and the
trunk t associated with μ have been ripped out. Also, we set


μ≤ = {(μT_1, μT_2) : T_1 ≤ T_2, μT_1 ≠ ∅, and μT_2 ≠ ∅}

μ⊒ = {(μx, μy) : x ⊒ y, μx ≠ ∅ or x = ∅, and μy ≠ ∅ or y = ∅}.

The relation μ≤ can be seen to be identical with ≤_2; it is a useful
mnemonic; it defines the hierarchy of routes in the “reduced” system
ν_2; the partial ordering induced in S(ν_2) [= μS(ν_1)] by this hierarchy is
precisely μ⊒.


VI. PRELIMINARY RESULTS 
In Ref. 2, for a general partial ordering R, the notation

sup_R A_cx

was used for the set

{y : z ∈ A_cx implies y R z} ∩ A_cx,


whenever this set was nonempty. The notation was chosen to denote
a set of R-maximal elements of A_cx, rather than an actual R-maximal
element itself, so as not to prejudge the question as to how many there
were. It will be shown that if the network ν under study is a progressive
grading, and R = ⊒ = natural hierarchy, then unless c is blocked in
x (and A_cx is empty) A_cx always has a ⊒-maximal element which is
unique to within equivalence under permutations of lines within their
line groups and trunks within their trunk groups.

Let now x be a state and let c ∉ x be a call which is not blocked in x.
It is apparent that for y, z ∈ A_cx we have either g(y - x) ≥ g(z - x)
or g(y - x) ≤ g(z - x). Hence, there is a y_0 ∈ A_cx such that


g(y_0 - x) ≤ g(w - x)
y_0 Q w
y_0 ⊒ w

for all w ∈ A_cx, and y_0 is unique to within equivalence. (Recall the
construction of Q in Section III, and the fact that ⊒ contains I ∪ Q.) Hence,


sup_⊒ A_cx (sup A_cx for short when the context permits)


exists, and equals 7(yo), 7(-) being the natural homomorphism of S 
into the quotient S/(2 n C). (See Ref. 2.) 
We now consider policies φ(·,·) such that

φ(e,x) = x - h          if e is a hangup h,
φ(e,x) ∈ sup A_cx       if e is a new call c not blocked in x.    (3)


Such a policy expresses the routing rule of always choosing the earliest 
available trunk in the natural hierarchy characteristic of a progressive 
grading. 
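
In operational terms, the rule amounts to hunting over the trunk groups available to a line in hierarchical order and seizing the first idle trunk. The sketch below is a simplified illustration rather than the policy as formally defined: it assumes that the groups accessible to the calling line are already listed earliest first, and the function name and trunk labels are hypothetical.

```python
def route_new_call(hunting_order, busy):
    """Pick a trunk for a new call, or return None if the call is blocked.

    hunting_order: trunk groups accessible to the calling line, earliest
        (primary) group first, each given as a list of trunks.
    busy: set of trunks already in use.
    """
    for group in hunting_order:
        for trunk in group:
            if trunk not in busy:
                return trunk                  # earliest available trunk
    return None                               # every accessible trunk is busy


groups = [["t1"], ["t3", "t4"]]               # a primary group, then an overflow group
print(route_new_call(groups, busy=set()))
print(route_new_call(groups, busy={"t3"}))
print(route_new_call(groups, busy={"t1", "t3", "t4"}))
```

When the primary trunk is idle it is always chosen, even if overflow trunks are already busy, which corresponds to filling a hole in the multiple before using a later route.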


The relation B (for “better”) was defined in Ref. 2 by the condition
x B y if and only if x ∼ y and every call blocked in x is also blocked
in y.


By Theorem 1, to be proved shortly, it will follow that x ⊒ y implies
x B y, which in turn implies s(x) ≥ s(y). Thus, the policies φ(·,·)
coincide with the “maximum s(·)” policies suggested in Ref. 2. (See Ref.
2 for notations.)


Lemma 1: If the line of c is not involved in the canonical map μ, and
A_cx ≠ ∅, then


μ(sup_⊒ A_cx) ⊆ sup_{μ⊒} A_{cμx}.


Proof: Let l* be the line of c, and suppose that

y ∈ sup A_cx.


Let l and t be the line and trunk, respectively, associated with μ. There
exists a trunk t* such that


y = x ∪ {(l*,t*)}
μy = μx ∪ {(l*,t*)}
t* ∈ inf {T : T ⊄ rng (x) and l* A T}.


Let T* denote the set (trunk group) achieving the infimum on the right.
Since t is busy in x and t* is not, t ≠ t*. Thus, T* ≠ {t}, and μ(T*) ≠ ∅.

We first observe that l* A T implies l* A_2 μT, since c is not involved
in μ.
We next show that $\mu T \not\subseteq \operatorname{rng}(\mu x)$ implies $T \not\subseteq \operatorname{rng}(x)$. If not, then
there exists $t_1 \in \mu T$, hence $t_1 \in T$, such that $t_1 \notin \operatorname{rng}(\mu x)$ and $t_1 \in \operatorname{rng}(x)$.
But

$$\operatorname{rng}(\mu x) = \operatorname{rng}(x) - \{t\}.$$

Hence, $t_1 = t$ = trunk removed by $\mu$. But this is impossible since $t_1 \in \mu T$,
while $t \notin \mu T$.


Now 7* S T for every T such that 7 € rng (x) and [*A7. From the 
two previous paragraphs, it follows that 
(uT*)(u S)pT 
for every 7 such that u7 € rng (ux), *AuT, uT ¥ 0. That is, 


uT* = inf {7:T € rng (ux), [* AT}. 


Now @ ¢ 7%, * 4 t,so * eyT*. If now w € Agyzy , then 
(w — px)(uS){(l*,t*)} 
wiuc)(ua vu {(l*,t*)}) 
w(uC)py. 
Thus, 


wy ESUP Agius 
pS 
and since y was arbitrary within sup A,, , the lemma is proved. 
> 


Lemma 2: In a progressive grading, $Q \subseteq B$.


Proof: Let x Q y. This implies that there exists z e B, a B, such that 
x—-—-2ZDY— 2,14, 


gy —2) 2 90 —s): 


Now let ¢ be a call from line / which is blocked in x but not in y. Then 
c is not blocked in z either. The only trunk which is busy in x and not 
in z is that used by the call y(a — z). Thus, since c is blocked in x and 
not in z, g(x — 2) is a trunk group usable for the call c. However, by 
property (22) of progressive gradings, {l} & g(y — z) C A, Le, J has 
access to the group g(y — z) as well. Hence, some trunk of g(y — 2) 
is idle in x, since the call y(2 — z) has a choice of routes in state z, one 
of these being on g(y — z). Thus, c is not blocked in x, and zx B y. 


Theorem 1: In a progressive grading, the partial ordering $D$ induced by
the natural hierarchy of routes is contained in $B$.

Proof: Immediate from Lemma 2 and the facts that $D$ is the transitive
closure of $J \cup Q$, and that $B$ is transitive.


Lemma 3: If $x \mathrel{D} y$, then $x$ is obtainable from $y$ by moving calls to earlier
routes in such a way that each call is moved at most once.


Proof: The result is true if only one move is made. Suppose it to hold
if $n$ moves in toto are made. Let $x$ be obtainable from $y$ by a sequence of
$(n + 1)$ moves. The trunk groups available for a given call $c$ form a
set simply ordered by $\le$, and so can be indexed $1, 2, \ldots$, the $\le$-earlier
receiving the lower integer. For $c \in \gamma(x)$, let $n(c,x)$ be the index of the
group used by $c$ in $x$. Some call $c$ that is moved in obtaining $x$ from $y$
achieves

$$\min\,\{n(c_1, x) \mid c_1 \text{ moved in getting } x \text{ from } y\}.$$

Starting in state $y$ it is possible to move such a call (once) directly to
its route in $x$, to get a state $z$ in which it is still possible to carry out
exactly each of the moves that take $y$ into $x$ except those involving $c$.
These are at most $n$ in number, so each call involved need be moved at
most once.

A policy $\varphi(\cdot,\cdot)$ is said to preserve a relation $R \subseteq\ \sim$ if $x \mathrel{R} y$ implies

$$\varphi(e,x) \mathrel{R} \varphi(e,y)$$

for every event $e$ that is either a hangup or a new call not blocked in
either $x$ or $y$. It has been shown in Ref. 3 for a general network that if
$\varphi$ preserves $B$ then it embodies the optimal routing policy for accepted
calls.

The main theorem we prove (Section VII) states that a $\sup A_{cx}$
policy, i.e., one satisfying (3), preserves $D$. The method to be used in
the proof of this result is illustrated in part by the following remarks:
consider linear arrays $x, y, z, \ldots$ each of $n$ urns, $n \ge 2$, each urn
containing at most one ball, with fewer than $n$ nonempty urns per array.
Let $x \mathrel{D} y$ mean that $x$ is obtainable from $y$ by moving balls to the left.
Let $\varphi x$ denote the result of adding a ball in the leftmost empty urn.


Observation: If $x \mathrel{D} y$, then $\varphi x \mathrel{D} \varphi y$.


Proof: The result is obviously true for $n = 2$ by enumeration. Let it
hold for a given value $n \ge 2$, and consider arrays $x$, $y$ of $n$ urns satisfying
the hypotheses. Let $\psi z$ denote the result of removing the leftmost urn
from $z$, and $bz$ that of adding an urn containing one ball at the left of
$z$. There are two cases: (i) the leftmost urns are empty in both $x$ and
$y$, or both nonempty in $x$, $y$; (ii) in $y$, but not in $x$, the leftmost urn is
empty.

Case (7): gx = bax, gy = bly, px D> Wy; hence, gr D gy. 

Case (ii): In obtaining $x$ from $y$ some ball moved into the leftmost
urn. Obtain $z$ from $y$ by moving just this one ball to the leftmost urn.
Then $x \mathrel{D} z \mathrel{D} y$, $\varphi x \mathrel{D} \varphi z$, $\varphi y = b\psi y$, $\varphi z = b\varphi\psi z$. Since $\varphi\psi z$ is obtained
from $\psi y$ by removing some ball, and replacing it in the leftmost empty
urn of the resulting array, we have $\varphi\psi z \mathrel{D} \psi y$, and so $\varphi z \mathrel{D} \varphi y$.
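
The Observation can also be checked by brute force for small numbers of urns. The sketch below is only an illustration, not part of the proof: it assumes that "moving balls to the left" means moving one ball at a time into any empty urn to its left, and the helper names (`left_moves`, `dominates`, `add_leftmost`, `observation_holds`) are hypothetical.

```python
from itertools import combinations


def left_moves(state):
    """Yield every array reachable by moving one ball to an empty urn on its left."""
    for i, full in enumerate(state):
        if not full:
            continue
        for j in range(i):
            if not state[j]:
                moved = list(state)
                moved[i], moved[j] = 0, 1
                yield tuple(moved)


def dominates(x, y):
    """True if x is obtainable from y by zero or more leftward moves (x D y)."""
    seen, stack = {y}, [y]
    while stack:
        s = stack.pop()
        if s == x:
            return True
        for t in left_moves(s):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return False


def add_leftmost(state):
    """The operation of the Observation: put a ball in the leftmost empty urn."""
    i = state.index(0)
    return state[:i] + (1,) + state[i + 1:]


def all_arrays(n):
    """All arrays of n urns holding fewer than n balls (1 = ball, 0 = empty)."""
    for k in range(n):
        for filled in combinations(range(n), k):
            yield tuple(1 if i in filled else 0 for i in range(n))


def observation_holds(n):
    """Check that x D y implies (phi x) D (phi y) for every pair of arrays of n urns."""
    arrays = list(all_arrays(n))
    return all(
        dominates(add_leftmost(x), add_leftmost(y))
        for x in arrays
        for y in arrays
        if dominates(x, y)
    )


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, observation_holds(n))
```

Exhaustive search is feasible here only because an array of $n$ urns has fewer than $2^n$ states; the inductive argument above is what covers all $n$.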

In cases 3 and 4 of the proof of the next theorem, the analog of the
inductive index $n$ will be the partial ordering of the set of progressive
gradings.


VII. PRINCIPAL RESULT 


Theorem 2: In a progressive grading $\nu$ let $D$ be the partial ordering
induced by the natural hierarchy of routes in $\nu$, and let $\varphi$ be a policy with the
property that

$$\varphi(c,x) \in \sup A_{cx}, \qquad c \in x,\ c \text{ not blocked in } x.$$

Then $\varphi$ preserves $D$.


Proof: The proof is by induction over the partial ordering > of the 
set of progressive gradings which is defined by the definition of covering 
given earlier. A grading v that is minimal in D has no “‘overflow groups’’, 
1e.. 2 = identity relation, so that no trunk group has a successor in 
the order 2 characteristic of »v. Thus, v consists entirely of trunk groups 
serving line groups on a one-to-one basis, so that for some n 


A — e (DL; P< T:), 
t=1 


where 
B= {L;,=1,---,n} 
Il = {7,;, =1,--:: ,n}, with iT; |= 1, 


In this minimal case > is the identity relation, and ¢ obviously preserves 
it. 

As a hypothesis of induction, we now suppose that every progressive
grading covered by $\nu$ has the property that any $\sup A_{cx}$-policy preserves
$D$. Let now $x \mathrel{D} y$ in $\nu$ and let $e \in x$. The induction argument will have
four cases, the last two of which are analogous to the observation made
earlier.

Case 1: $x \mathrel{D} y$, and $e$ is a hangup $h$. There is a sequence $x = z_1, z_2, \ldots,$
$z_n = y$ with

e987 jeder yn— 1. 
This sequence indicates how one would get y from x by moving calls 
to “preferred” routes. By Lemma 3 it is no restriction to assume that 
no call is rerouted more than once. Let the route of h be rin x and q in y. 


If h is one of the calls whose route is changed in the above sequence, say 
to take z, into 2,,, by changing the route of h from r to q, then 


is a sequence which shows that (2 — r) D (y — q). If the route of h is 
not changed, then r = g and the same conclusion follows. 


Case 2: x D y, and e € x is a new call c blocked in x. By Theorem 1, 
x B y, so c is also blocked in y. Then, 


A cz = {x}, Ay ~ ty} 
y(c,z) = x g(c,y) = y 
y(c,x) 2 ¢(c,y). 


Case 3: x D y, eis a new call c not blocked in either x or y, and the line 
group L of c is not a bye. Let 


T = inf {S:LAS}. 


iw 


Subcase 3.1: T is full in neither x nor y. Then there exist routes r, q¢ 
such that g(r) = g(q) = T, 


o(c,x) =a“ U?, o(c,y) =Y¥ Vv 4q; 
r = g modulo trunk permutations within 7’, and clearly 
g(c,2) ~ g(c,y) 


since c was put up on group T in both cases. To see this, if « = 


21,22, °° » 2, = Y IS a Sequence with 
2; O 2341 g=1,°-°: ,n—I, 
showing that x > y, then 
LUr=2, UU yk UrT=YuUrY Y gis a sequence which 
shows that 


(@urQyv q. 
This is because we can assume without loss of generality that the
transformations which change $y$ into $x$ reroute a call at most once, and thus
move no calls onto $T$. (Lemma 3.)


Subcase 3.2: T is full in both x and y. Since L is not a bye, there exist 
l_.melLandt, ue T with 


(l,d)ex and (mu) ey. 


Because 1, m and é, wu are respectively interchangeable, 1.e., since lines 
and trunks are permutable within their respective groups, no loss of 
generality is incurred if it is supposed that 1 = m andt = n. Let pu be 
the canonical map corresponding to ripping out / and Z. 

Then v covers v, , where y, 1s defined by ripping / and ¢ out of », 1.e., by 


in —< in case 1 
I — {7} in case 2, 
ume ‘— in case 1 
m— {D} in case 2, 
> with 7 — {t} replacing 7’ throughout, in case 1 
w= = 2, 
2 Sy Xt) in case 2, 
th A cane in case 1 
A— (UX {t}) - ZX Q) in case 2, 
with 


caseel=T7+ {t} or An(LT) C(LX {t}) 
casee2=T = {t} and An (LT) C (L®&X {t}). 


The line of c is not involved in p, and A,, ~ 0, A., ~ 6. Hence, Lemma 
1 gives 


po(c,x) Sd A eur) 
we(c,y) Esup Agu) . 
re) 
Since x D y, and either both wx = 0, py = 0, or neither, we have 
(ux my) € wD. (4) 
Let ~ be a policy for », with 
E(d uz) € hes A atue) ? Vd E pe. (5) 
The hypothesis of induction and (4) give


E(c,ux)(u>)E(C, my). (6) 
However, by (6) and (5) 


po(c,x)(uD)E(C,ux) 
(uD )JE(C uy) 
(u>)ue(c,y). 


But ¢(c,x) differs from ye(c,x) and ¢g(c,y) from py(c,y), only in having 
an additional line 1 and an additional trunk ¢ connected to each other. 
Hence, 


(cx) ~ o(C,y) ° 


The argument of subcase 3.2, basic to Theorem 2, can be appreciated 
by looking at it thus: x D y means 4z,, °°: , & with 2; Q 241,7 = 
lee, n—14=2,2 = y. Since 


e= {0,0} Szany 


we have r S z;,7 = 1, --- , m because we can assume that the call 


using r is not moved as y is transformed into zx by moving calls. Thus, 
(2; —7r) Q @is1 — 1), ~=1,°---,n-1 
@—rnQdy >). 


But the ‘‘cone” {z:z = r} is isomorphic to the states of a grading (», of 
the proof) covered by v and the isomorphism, 22z., uw restricted to the 
cone, has the basic property, for x, y in the cone 


x 2y if and only if (ux)(u2) (my). 


Subcase 3.3: T is full in x, but not in y. Since ZL is not a bye it is S- 
minimal, and hence there exists a call d with d S y(x) nm y(y) such that 
dis on T in x is not on T in y, and can be moved to T in state y to give 
rise to a new state z without rendering impossible any the remaining 
moves which transform y into x. Thus, 7 D z D y. Since x n z ¥ 8, 
subcase 3.1 gives v(c,x) D (cz). Further, the route of c in ¢(c,z) is 
no higher (later) in $ than the one in y left by d as it was moved to 
T to give rise to state z. Hence, to within equivalence 


y(c,z) D v(c,y). 


Case 4: $x \mathrel{D} y$, $e$ is a new call $c$ not blocked in either of $x$ or $y$, and the
line group of $c$ is a bye. There is at least one other line group $L$ which
is not a bye. Let
T = inf {S:LAS}. 


s 


Subcase 4.1: L XT nxw4#0LXTanyxroboorLxTnz= 8, 
LXToay = 6. Since L is not a bye, there exist 1,m eLandt,u eT 
with 


(bt) 2%, (m,u) Ey 
or 
(1,t)¢2, (mu) #y. 


(In the second instance, property (2) of the definition of a progressive 
grading has been used to conclude that there must be idle lines on L 
if there are idle trunks on 7.) 

As in subcase 3.2, no loss of generality is incurred if it is supposed 
that 1 = m and ¢ = wu. Let uw be the canonical map corresponding to 
ripping out J and ¢. The argument now continues as in subcase 3.2. 


Subcase 4.2: (L XT) nx4#90,(LXT) ny = 6.Since Lis S-minimal, 
there exists a call d with d = y(x) n y(y) such that d is on T in @, is 
not on 7 in y, and can be moved to 7 in state y to give rise to a new 
state z without rendering impossible any of the remaining moves which 
transform y into x. Thus, x D z D y. Since x n z ¥ 6, subcase 3.1 gives 


e(c,x) 2 9(c,2). 
Let [* be the line of c, r be the route of d in y, Tz = g(r), and 


T,. = gQnf {S:*AS,S € rng (y)}). 
Here 7’, is the earliest group c could be put on in y. Let also y denote 
the operation of moving d from T, to T, and for any call f 
A; = tor): = v(r)} 
= {S:the line of f has access to S}. 


Case (1): Tz € A, N Ag, Ta S T,. Then moving d from T,; to T means 
that c can use 7', in Zz, so o(c,z) > o(c,y), because ¢(c,z) results from 
y(c,y) by moving first d to J, and then ¢ to T, , so actually 


o(C,2) — volc,y). 
Case (11): A, nN Ag = 0, or A, A Ag ¥ 6 and either 
T.,T2aeA,-AAz, 
or

Tae A, n Aa, Tee Ach Ags 
or 

T.&EA, Nn Aa, dae: rs ae ae 
or 


T.,T2e A. n Aa, fie es rae 
In all these cases Yo(c,y) = v(e~y) = ¢(c,z), whence 
o(cz) > ¢(¢,y). 
Integral Equation for Simultaneous
Diagonalization of Two
Covariance Kernels


By T. T. KADOTA 
(Manuscript received January 20, 1967) 


Let $K_1(s,t)$ and $K_2(s,t)$, $-T \le s,t \le T$, be real, symmetric, continuous
and strictly positive-definite kernels, and denote by $K_1$ and $K_2$ the
corresponding integral operators. Let $x(t)$ be a sample function of either of
two zero-mean processes with covariances $K_1(s,t)$ and $K_2(s,t)$. We prove
a generalized version of the following: If the integral equation

$$(K_2\psi)(t) = \lambda\,(K_1\psi)(t), \qquad -T \le t \le T,$$

has formal solutions $\lambda_i$ and $\psi_i(t)$ which may contain $\delta$-functions, and
if $\{K_1\psi_i\}$ forms a complete set in $\mathcal{L}_2[-T,T]$, then (i) the two kernels have
the following simultaneous diagonalization:

$$\begin{aligned} K_1(s,t) &= \sum_i (K_1\psi_i)(s)\,(K_1\psi_i)(t), \\ K_2(s,t) &= \sum_i \lambda_i\,(K_1\psi_i)(s)\,(K_1\psi_i)(t), \end{aligned}$$

uniformly on $[-T,T] \times [-T,T]$, and (ii) the sample function has an
expansion

$$x(t) = \sum_i (x,\psi_i)\,(K_1\psi_i)(t)$$

in the stochastic mean, uniformly in $t$, and the coefficients are
simultaneously orthogonal, i.e.,

$$E_1\{(x,\psi_i)(x,\psi_j)\} = \delta_{ij}, \qquad E_2\{(x,\psi_i)(x,\psi_j)\} = \lambda_i\,\delta_{ij},$$

where $(x,\psi_i)$ is obtained by formally integrating $\psi_i(t)$ against $x(t)$.


I. INTRODUCTION 

Let $K_1(s,t)$ and $K_2(s,t)$, $-T \le s,t \le T$, be real, symmetric, continuous
and strictly positive-definite kernels, and denote by $K_1$ and
$K_2$ the integral operators with kernels $K_1(s,t)$ and $K_2(s,t)$. We have
previously$^1$ established that, if $K_1^{-1/2}K_2K_1^{-1/2}$ is a densely defined and
bounded operator on $\mathcal{L}_2$ (the space of all square-integrable functions
on $[-T,T]$) and if its extension to the whole of $\mathcal{L}_2$ has eigenvalues
$\lambda_i$ and complete orthonormal eigenfunctions $\varphi_i(t)$, $i = 0, 1, \ldots$, then
the two kernels have the following simultaneous diagonalization:

$$\begin{aligned} K_1(s,t) &= \sum_i (K_1^{1/2}\varphi_i)(s)\,(K_1^{1/2}\varphi_i)(t), \\ K_2(s,t) &= \sum_i \lambda_i\,(K_1^{1/2}\varphi_i)(s)\,(K_1^{1/2}\varphi_i)(t) \end{aligned} \tag{1}$$

uniformly on $[-T,T] \times [-T,T]$. In addition, if $x(t)$ is a sample function
of either of two (separable and measurable) zero-mean processes
with covariances $K_1(s,t)$ and $K_2(s,t)$ with associated measures $P_1$ and
$P_2$, then

$$x(t) = \sum_i \eta_i(x)\,(K_1^{1/2}\varphi_i)(t) \tag{2}$$

in the stochastic mean, uniformly in $t$. Moreover,*

$$E_1\{\eta_i(x)\eta_j(x)\} = \delta_{ij}, \qquad E_2\{\eta_i(x)\eta_j(x)\} = \lambda_i\,\delta_{ij},$$

where†

$$\eta_i(x) = \lim_{n\to\infty}\,(x, K_1^{-1/2}\varphi_{in}) \tag{3}$$


in the stochastic mean, and {¢;,} 1s any sequence of functions in the 
domain of Kj? such that lim || 9; — ¢;, || = 0.°°? Furthermore, if the 
two kernels have continuous $2r$th derivatives $(\partial^{2r}/\partial s^r \partial t^r)K_p(s,t)$, $p =
1, 2$, then (1) and (2) can be differentiated term-by-term $r$ times while
retaining the same senses of convergence.$^1$

We remarked in Ref. 1 that, if $\varphi_i$ is in the domain of $K_1^{-1/2}$, $\psi_i = K_1^{-1/2}\varphi_i$
satisfies the integral equation

$$(K_2\psi)(t) = \lambda\,(K_1\psi)(t), \qquad -T \le t \le T, \tag{4}$$

and

$$\eta_i(x) = (x,\psi_i) \ \text{a.s. (almost surely)}, \qquad (K_1^{1/2}\varphi_i)(t) = (K_1\psi_i)(t). \tag{5}$$

* $E_p$, $p = 1, 2$, denotes the expectation with respect to $P_p$.
† For any $f, g \in \mathcal{L}_2$, $(f,g)$ denotes the inner product of $f$ and $g$, and $\| f \|$ the norm
of $f$.

Slepian (private communication) has long conjectured that, if (4)
admits formal solutions $\lambda_i$ and $\psi_i$, $i = 0, 1, \ldots$, where $\psi_i$ may contain
$\delta$-functions and their derivatives, then the expansion coefficients and
functions of (2) are given by formally substituting such $\psi_i$ into (5).*
This conjecture, proved here, is significant since it provides a concrete
means of obtaining the expansions (1) and (2). To illustrate the point,


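consider the following pair of covariance kernels:

$$K_1(s,t) = e^{-\alpha|s-t|}, \qquad K_2(s,t) = e^{-\beta|s-t|}.$$

For this pair, (4) admits the following formal solutions”

$$\begin{aligned} \psi_{2k}(t) &= \cos\theta_k t + \frac{\cos\theta_k T}{\alpha+\beta}\,[\delta(t-T) + \delta(t+T)], \\ \psi_{2k+1}(t) &= \sin\hat\theta_k t + \frac{\sin\hat\theta_k T}{\alpha+\beta}\,[\delta(t-T) - \delta(t+T)], \end{aligned} \qquad k = 0, 1, \ldots, \tag{6}$$

corresponding to

$$\lambda_{2k} = \frac{\beta(\alpha^2+\theta_k^2)}{\alpha(\beta^2+\theta_k^2)}, \qquad \lambda_{2k+1} = \frac{\beta(\alpha^2+\hat\theta_k^2)}{\alpha(\beta^2+\hat\theta_k^2)}, \tag{7}$$

where $\theta_k$ and $\hat\theta_k$ are positive solutions of

$$(\alpha+\beta)\,\theta_k \tan\theta_k T = \alpha\beta - \theta_k^2, \qquad -(\alpha+\beta)\,\hat\theta_k \operatorname{ctn}\hat\theta_k T = \alpha\beta - \hat\theta_k^2, \tag{8}$$

respectively, indexed in ascending order. Thus, formally,

$$\begin{aligned} (x,\psi_{2k}) &= \int_{-T}^{T} x(t)\cos\theta_k t\,dt + \frac{\cos\theta_k T}{\alpha+\beta}\,[x(T) + x(-T)], \\ (x,\psi_{2k+1}) &= \int_{-T}^{T} x(t)\sin\hat\theta_k t\,dt + \frac{\sin\hat\theta_k T}{\alpha+\beta}\,[x(T) - x(-T)], \end{aligned} \tag{9}$$

$$(K_1\psi_{2k})(t) = \frac{2\alpha}{\alpha^2+\theta_k^2}\cos\theta_k t, \qquad (K_1\psi_{2k+1})(t) = \frac{2\alpha}{\alpha^2+\hat\theta_k^2}\sin\hat\theta_k t. \tag{10}$$

Through a direct calculation, we previously” established that

(i) $K_1^{-1/2}K_2K_1^{-1/2}$ is densely defined and bounded,

(ii) its extension has eigenvalues $\lambda_i$ given by (7) and complete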


* Similar conjectures have been made elsewhere.?:4 


886 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1967 


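orthonormal eigenfunctions $\varphi_i$ given as

$$\varphi_i = c_i \operatorname*{l.i.m.}_{n\to\infty} \sum_{j=0}^{n} \mu_{1j}^{1/2}\,(\psi_i, f_{1j})\,f_{1j},$$

(iii) $\eta_i = c_i (x,\psi_i)$ a.s.,* $K_1^{1/2}\varphi_i = c_i K_1\psi_i$, which verifies Slepian's
conjecture for this example. Here $c_i$ is a normalization constant given by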


C Jos | (7 Ae fat Blab yf 
- a + 6 Rte +B) +a /) ’ 








$$c_{2k+1} = c_{2k}\big|_{\theta_k \to \hat\theta_k}$$

(that is, $c_{2k+1}$ is obtained by replacing $\theta_k$ with $\hat\theta_k$ in $c_{2k}$), $\mu_{pj}$ and $f_{pj}$,
$p = 1, 2$, $j = 0, 1, \ldots$, are the eigenvalues and orthonormal eigenfunctions
of $K_p$, and $(\psi_i, f_{1j})$ is defined analogously to (9).
In this paper we prove the generalization of (i), (ii), and (iii), starting
with abstract kernels $K_1(s,t)$ and $K_2(s,t)$ and a generalized version of
the integral equation (4).
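
For the exponential-kernel example, the numbers $\theta_k$ and $\hat\theta_k$ have to be found numerically from the transcendental equations (8). The following is a minimal sketch under stated assumptions, not the author's procedure: each equation is multiplied through by $\cos\theta T$ (or $\sin\theta T$) to remove the poles, and roots are located by a sign scan followed by bisection. The function names, the scan step, and the eigenvalue expression $\beta(\alpha^2+\theta^2)/[\alpha(\beta^2+\theta^2)]$ (obtained by substituting the formal solutions into (4)) are assumptions of this illustration.

```python
import math


def residual(theta, alpha, beta, T, odd=False):
    """Pole-free form of (8): the tan (or ctn) equation times cos (or sin) of theta*T."""
    s, c = math.sin(theta * T), math.cos(theta * T)
    if odd:
        # -(a+b) theta ctn(theta T) = ab - theta^2, multiplied by sin(theta T)
        return (alpha + beta) * theta * c + (alpha * beta - theta ** 2) * s
    # (a+b) theta tan(theta T) = ab - theta^2, multiplied by cos(theta T)
    return (alpha + beta) * theta * s - (alpha * beta - theta ** 2) * c


def positive_roots(alpha, beta, T, count, odd=False, step=1e-3):
    """First `count` positive roots in ascending order (sign scan, then bisection).

    `step` is an assumed scan width; it must be small compared with pi / T.
    """
    roots = []
    lo = step
    f_lo = residual(lo, alpha, beta, T, odd)
    while len(roots) < count:
        hi = lo + step
        f_hi = residual(hi, alpha, beta, T, odd)
        if f_lo * f_hi < 0.0:
            a, b, f_a = lo, hi, f_lo
            for _ in range(60):
                mid = 0.5 * (a + b)
                f_mid = residual(mid, alpha, beta, T, odd)
                if f_a * f_mid <= 0.0:
                    b = mid
                else:
                    a, f_a = mid, f_mid
            roots.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return roots


def eigenvalue(theta, alpha, beta):
    """Eigenvalue paired with a root theta (assumed form, see the lead-in)."""
    return beta * (alpha ** 2 + theta ** 2) / (alpha * (beta ** 2 + theta ** 2))


if __name__ == "__main__":
    alpha, beta, T = 1.0, 2.0, 1.0
    for theta in positive_roots(alpha, beta, T, 3):
        print("even", theta, eigenvalue(theta, alpha, beta))
    for theta in positive_roots(alpha, beta, T, 3, odd=True):
        print("odd ", theta, eigenvalue(theta, alpha, beta))
```

Under these assumptions every eigenvalue lies between $\alpha/\beta$ and $\beta/\alpha$, which is the kind of two-sided bound on $\lambda_i$ that the main theorem requires.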


II. MAIN RESULT 
Theorem: Let K,(s,t), p = 1, 2, -—T S s,t S T, be real, symmetric, 
strictly positive-definite kernels with continuous 2 rth derivatives 
(0°"/dsdt’)K,(s,t). If there exist sequences of real numbers {Qitm}, 

{tnt: -T St, S T, and {),}: 
0<b £1; 85 &, ~=0,1,---, (11) 


for some constants b, and bz, , and sequences of square-integrable functions 
{W.1}, which satisfy the equation 


r 


> i (2, KG) Yad aa x — 2 K,(s,t) = 


L=0 


(12) 





r T a’ a : a: 
ee | i (2, K,(s,) Wald dt + )° dim = K,(s,2) : 
L=0 —-T Ot | m=1 ot t=tm 
—Tes HT, 


such that the right-hand side of (12) forms a complete set in £2 , then 
(() K7?K.K;? is a densely defined and bounded operator on £2 , 
(22) ats extension to the whole of &2 has eigenvalues and complete ortho- 
normal eigenfunctions, which are the \; and 


y(s) = > ita dan6 + > tuanK bt) | ’ (13) 


L=0 


* This portion is proved in a separate article.® 
(iii) $\eta_i$ and $K_1^{1/2}\varphi_i$ of (2) can be given, respectively, by


n(x) = » ce ad r » carat) | a.8. (14) 


and by the right-hand side of (12) without \;. Here, K?,,, p = 1, 2, 
denotes an integral operator whose kernel is defined as 


Kiou(s, t) re » foi (for OD) b= 0, 1,224 405 (15) 
in the mean in s, uniformly in t. 


Remarks: 
(7) K%,,(s,t) of (15) is well defined since 


art 


op Ke), p=1,2, (16 


2 Mil pi (fri Q) = a 
uniformly in (s,t).” It follows from this that (15) converges in the 
mean in (s,f) as well. Hence, from Fubini’s theorem, K?,,(s,t) is a 
square-integrable function of t for almost every s. Thus, ¢;(s) of (18) 
is well defined. We assume without loss of generality that 9;, 7 = 
0,1, --- , are normalized. 


(iz) For the example in Section I, r = 0,q = 2,4 = T,t, = —T, and 
Wor, o(t) = Co, COS 4,1, Wone1,0(4) = Cox+1 SIN 6,t, 
, a ee 0s 6.7 
2k,0,1 —_ 2k,0,2 ~~ a -|- B ’ 
sin 6,7" 
Q26+1,0,1 = —Gent+1,0,2 = Const ee ’ 
a 
by zt B ’ bs = : ’ 


the right-hand side of (12) without \; is given by (10), and completeness 
of {cos 6,t, sin 6,¢} follows from (18) and a gap-and-density theorem.° 


III. PROOF OF THEOREM 


For notational simplicity, we write K,.1, p = 1, 2, for the integral 
operator whose kernel is 


k+l 


= FF apt Kou), Kt =O, 1, ee 


Kyu 2) Ou 


Kyo and K?,, are abbreviated as before by K, and K?, respectively. 




(7) For any f,ge&., 


(Koa Ki019) = (f ong); (17) 
Kg Te) aay (18) 
nO 7=0 


To prove (17), note 


T 
(K},.f,K3o.9) = | | | (8)g(t) Kon (1,8) Ktor(u,t) ds dt du 
~T 


=f f 1600 X uP OPO as at 


= (f TK ikg) , 


where the second equality follows from the mean convergence of (15) 
and the third from the uniform convergence of (16). To prove (18), 
consider 


which vanishes as 7 — © since (16) converges uniformly in (s,t). 
(7) K;?Ki and K;?K? are densely defined and bounded on &, . 
To prove this, apply Kz? on both sides of (12) and use (18) to obtain 





n 2 n 
Kjo09 _ »D isola @ a (9,40 5119) a De yi(fn; 19) 


r 


>» Exe i 2 carn ba te) | = \ Kz Kio; . 


t=0 
Then, for each 2, 


nN; || Ka?Kiy, ||? 


= 2 Cone , Khoi) 


k,l=0 


+ DY [asrm(Kbor(* stm), KEonWin)  Ainm(KGoe(+ bm)» Kbori2)] 


+ > Oitm@iin( Kdo1( + Lads Ko ‘} 
1 


m,n= 


T 


= oS {( vi » Ko Wir a p> asinKan( tn) 


k,l=0 


r ve tin] (KenrVed(l) +} ys AitmEoni(tn , | 




= j, y (hava Kb win ae Se Tee Sal t0)) 


m=1 


a S 


ae asia Khu Ls Rbowar ae Dy QsimE toil 1.) 


n=1 m=l1 


2 
’ 








me ri || @: 


where the second equality follows from (17) and (18), the third from 
k time differentiation of (12) and from (17) and (18), and the last 
from (13). Hence, with 9; being normalized, 


1 


| Ky ’Kig; | = r; ’ 


7=0,1,--: 

Now {y;} is complete since the right-hand side of (12) without ), , 
which forms a complete set by hypothesis, is equal to Kiy;, and K? 
is strictly positive-definite. Hence, from (11), Ky?K? is densely defined 
and bounded. 

To prove that K;?K3 is also densely defined and bounded, define 
¢,; as the normalized right-hand side of (18) with the subscript 1 re- 
placed by 2. Completeness of {¢;} is similarly deduced via (12). Now, 
by following the same procedure with the roles of K, and K, inter- 
changed, we obtain 


|| KriK3¢, 
Then, the assertion follows immediately from (11). 
(vit) The ranges of K? and K? are equal, namely, 
Ki(£2) = Rie): 

To prove this, denote by ZL and M the extensions to the whole of 
&, of K>?K? and Ky'K? respectively, which exist as a result of (it). 
Since the domains of K?Z and KiM are &, , which is also the domains 
of K? and K?, we have 

Ki = KiL, K} = Kim. 
Then, from the first equality, Ki(£.) C Ki(£,), while, from the second, 
K3(£2) C Ki(£2). Hence, the assertion holds. 


Shes 7=0,1,-°-. 





(wv) 
K},,(-,f) = Lim. >> KY fPO, —-Tsis tT, (19) 
NC 7=0 
Kijg = lim. >> K2f,,(f{?,9), ge&. (20) 
N00 7=0 
To prove (19), note first that $f_{1j}$, $j = 0, 1, \ldots$, are in the domain
of $K_2^{-1/2}$ as a result of (iii) and also that $(K_2^{-1/2}f_{1i}, K_2^{1/2}f_{1j}) = \delta_{ij}$ from
orthonormality of $\{f_{1j}\}$. Thus, $\{K_2^{-1/2}f_{1j}\}$ and $\{K_2^{1/2}f_{1j}\}$ form a pair of
mutually reciprocal bases of $\mathcal{L}_2$. Hence,


Ki. (-,f) = lim. p K3f1;(Ko"fi; , K3or(- 2). (21) 


ne 


But from (15) 


(Ke thy , Kbor(- ,t)) = p3 (fi; fodfar @, {= 0, 1, ea eg (22) 


uniformly in ¢. Now, since {f.;} is an orthonormal basis of £, , 


fig = lam. D2 (fii, fox) fa: 


oe 


But, according to (22), the right-hand side converges uniformly. Hence, 
the above partial sum must converge uniformly to f,; . Suppose for 
some k,0 Sk <1, 


10 =D Gur feoIPO (23) 


uniformly in ¢. Then, from (22), 


APO = > (hi fedf PO) 
uniformly in ¢.” Hence, by induction, (23) holds for every k,O Sk Sr 
Therefore, from (22), 
(Ke*hi; , Kin()) = fP?@, b=0,1, +--+ ,7. 
Then, (19) follows from (21) and the above. 
To prove (20), we expand K3,, g relative to {K?},,}: 


K3uig = |.i.m. p> (Ke tf , Kh. g) Kih ; ’ 


n—0O 


and note from (18) and (28) that 


(Kshs, Khoa) = Gus dG? 9) = O90. 


(v) To prove (i) of the theorem, we note from (iii) and (ii) that
$K_1^{-1/2}K_2^{1/2}$ is everywhere-defined and bounded on $\mathcal{L}_2$. Hence, its adjoint
$(K_1^{-1/2}K_2^{1/2})^*$ is also everywhere-defined and bounded. Now, for any
$f \in \mathcal{L}_2$ and $g \in \mathcal{D}(K_1^{-1/2})$, the domain of $K_1^{-1/2}$, we have $(K_1^{-1/2}K_2^{1/2}f, g) =
(f, K_2^{1/2}K_1^{-1/2}g)$. Thus, $K_2^{1/2}K_1^{-1/2}g = (K_1^{-1/2}K_2^{1/2})^*g$, $g \in \mathcal{D}(K_1^{-1/2})$. Hence, $K_2^{1/2}K_1^{-1/2}$
is bounded. Since $\mathcal{D}(K_1^{-1/2})$ is dense in $\mathcal{L}_2$, we conclude that $K_1^{-1/2}K_2K_1^{-1/2}$
is densely defined and bounded.

(vt) To prove (22) of the theorem, define 


Pin(L) = > Mi; ~ | (ve ’ 1) a > Qitm ba) [ful0. (24) 


and note y;, € D(K;*) and lim || 9; — ¢,, || = 0. Then 


34 
nr 


lim. K.Kjy*y,, = lim. >> >> ie ft) Gan bm) [Reh 
~=0 m=1 


n—->00 nN? j3=0 


r 


> | Kaas + 2 asraK a(t) | 


i=0 


= \,Kig; ’ 


where the second equality follows from (19), (20), (15) and (18), and 
third from (12) and (13). Now denote by Q the extension of K7?K,Ky7? 
to the whole of £, . Then, 


KiQf = l.i.m. K,K7"f, 


no 


for any f € S. and {f,}: f, ¢ D(K7), lim || f — f, || = 0, since 
|| KiQf — KKy"f, || S || Keg — f.) || + || @i@ — K2Kr)f, || 


which vanishes as nm — ©. Therefore, Qy; = X,9,. Lastly, since {g;} 
is complete in £,, {d;} constitutes the entire spectrum of Q. 
(viz) To prove (272) of the theorem, note from (8), (24) and (vz) that 


n(x) = l.i.m. 2d du ie ie) SB a carnliCba) | ’ oy 
Now 


hy, Ga) _ 2d Ce hahr Wa | 





ae (Wir » Karin) = Do mas(ber Py a ee 


iy 





a (2) _ ys @ihohie O | = Kini, — aul POP, 


7=0 
both of which vanish as $n \to \infty$ by virtue of (16). Also, with the use
of (17) and (18),


EE, Ge.) ru Dy (a fidhz » i) | 
Pa (Wir ’ Koii1) =e 2 (Wir ti Gis ’ Koo. Wii) 


+ ar APY a PMs » Kehed 


2 
’ 








_ Kiowa — Dy Kahs(h? i) 


By \ 2° O-= D> Giph  @ 


- > fir Ohie Ochi» Koha) = Kha: st) — X KY APO 


both of which vanish as n — © by virtue of (19) and (20). Therefore, 
upon combination of the above results, (14) is proved. 





= Kolb) — 2 DAPOCRarhdO 


2 
’ 
Principles of Design of Magnetic Devices for
Attitude Control of Satellites


By M.S. GLASS 
(Manuscript received December 29, 1966) 


Magnetic devices mounted within an orbiting satellite interact with 
the earth’s magnetic field and produce torque to modify the attitude or 
angular adjustment of the satellite axis of spin. The satellite environment 
dictates that these devices be designed for minimum weight or minimum 
power consumption, or a suitable compromise between these two minima. 
Principles of design of magnetic devices to satisfy these requirements are 
developed in this paper. The resulting design equations and charts enable 
the ready optimization of design and selection of preferred materials. 
While most of this work was directed initially at the Telstar® satellite
project, the design charts and formulas are found useful in other areas 
of magnet design. Methods of magnetic measurement devised for the satellite 
are discussed. 


I. INTRODUCTION 


Satellites with directional instrumentation, such as the antenna sys- 
tem of the communications satellites, require attitude control to keep 
this instrumentation properly on target. For example, a spin imparted 
to the satellite at time of launch gives it a sort of gyroscopic stability. 
However, complete attitude control requires some available torque to 
correct the direction of the spin axis. 

In the orbiting satellite the earth’s gravitational field is balanced by 
centrifugal force, leaving the earth’s magnetic field as a convenient 
means for interaction torque. Suitable interaction with the earth’s 
magnetic field can be set up by electromagnets, or by air-core coils of 
large area, either of which can be turned on or off at will to provide 
attitude correction as needed. Small permanent magnets can be de- 
signed and installed to cancel out residual magnetic moment in the 
satellite, which if permitted to interact with the earth’s field could 
cause precession of the spin axis. Other miscellaneous torque applications
of magnets in the satellite have been proposed and investigated.

Limitations of payload and of available power in the satellite gen- 
erally make it necessary to design with quantitative accuracy and to 
optimize the factors which control weight and power consumption. To 
this end, the magnet designer may select from various geometries of 
magnet and coil and from various available materials. This selection 
and optimization is facilitated by the use of suitable design formulas 
and charts. In this paper, we review the derivation and illustrative 
use of such formulas and charts. While the work reported here has 
been aimed specifically at certain problems of the Telstar® satellite, 
it is evident that the technique of magnet design presented here is ap- 
plicable to any similar set of problems. 

For the convenience of the magnet designer who buys magnets and 
magnet wire by the pound, measures them in feet, inches, or mils, 
and measures torque in pound-inches, all of the derived design 
formulae and graphs are built around the practical units (inches, 
pounds, oersteds, gauss, etc.). This avoids the necessity of converting 
units, which is time consuming and can lead to costly errors. There 
is included for convenience a table of the most frequently used con- 
version factors (Table I). 


II. QUANTITATIVE DESIGN OF AIR-CORE COIL FOR TORQUE 


The torque characteristic of the air-core coil is derived from the
galvanometer formula which, in some textbooks, is written in MKS
units:

$$NIA\ (\text{ampere-turn-meter}^2) = \frac{10^7}{4\pi}\,\frac{T_m}{H_e}\ (\text{weber-meters}) \tag{1}$$


TABLE I — CONVERSION FACTORS

1 unit pole (emu)  = $4\pi$ maxwells
1 oersted          = $10^3/4\pi$ ampere turns per meter
                   = 2.02 ampere turns per inch
1 weber-meter      = $10^{10}/4\pi$ emu
                   = $8.85 \times 10^3/4\pi$ lb-in per oersted
1 newton-meter     = $10^7$ dyne-cm
                   = 8.85 lb-in
TABLE II — CHARACTERISTICS OF WINDINGS

                   Copper                               Aluminum
R (ohms)           $0.75 \times 10^{-6}\,N^2P/A$          $1.21 \times 10^{-6}\,N^2P/A$
E (volts)          $0.75 \times 10^{-6}\,N(NI)P/A$        $1.21 \times 10^{-6}\,N(NI)P/A$
W (watts)          $0.75 \times 10^{-6}\,P(NI)^2/A$       $1.21 \times 10^{-6}\,P(NI)^2/A$
Wgt. (lbs.)        $0.321\,AP$                           $0.0983\,AP$
(Power × Wgt.)     $0.241 \times 10^{-6}\,P^2(NI)^2$      $0.119 \times 10^{-6}\,P^2(NI)^2$

NI: Required ampere turns
N: Number of turns used
A: Cross section area of winding (inch²)
   (N times the section area of a single turn)
P: Average length of turn in winding (inch)


and may be written in practical units:

$$NIA\ (\text{ampere-turn-inch}^2) = 1.667 \times 10^{6}\,\frac{T_m}{H_e}\ (\text{lb-in/oersted}). \tag{2}$$


Here $T_m$ is the maximum torque exerted on the coil when its axis is
perpendicular to the field $H_e$, and $NIA$ is the required product of
ampere turns and area enclosed by the coil to deliver that amount of
torque.

It is convenient to set up a table of formulas from which one may 
translate the geometry and ampere-turn characteristics of the coil into 
power and weight requirements. The power and weight will depend 
upon the winding material used, but practical considerations usually 
limit this to copper or aluminum. So one may take the weight and 
resistivity characteristics of copper and aluminum from handbook 
tables and with the aid of Ohm’s Law derive the formulas of Table II. 
Using (2) and Table II, one may estimate readily the power and 
weight of an air-core coil to satisfy specified torque requirements. It is 
evident that copper has the advantage in lower power consumption, 
but that aluminum offers a greater advantage in weight reduction. If 
power and weight are of about equal importance, then the power- 
weight product should be minimized. It is evident that aluminum has a 
factor-of-two advantage over copper in this characteristic. 
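
The comparison between copper and aluminum can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the coefficients are those read from Table II (inch, pound, ohm units), and the factor $10^{-6}$ on the resistance coefficients is an assumption inferred from the power-weight row of that table. The power follows from Ohm's law for $N$ turns sharing a winding cross section $A$, i.e. $W = \rho P (NI)^2 / A$.

```python
# Coefficients as read from Table II; the 1e-6 factor on the resistance
# coefficients is inferred from the power-weight row (an assumption).
WINDINGS = {
    "copper": {"ohm_in": 0.75e-6, "lb_per_in3": 0.321},
    "aluminum": {"ohm_in": 1.21e-6, "lb_per_in3": 0.0983},
}


def coil_power_watts(material, ampere_turns, area_in2, turn_length_in):
    """I^2 R loss: rho * P * (NI)^2 / A, with A the total winding cross section."""
    rho = WINDINGS[material]["ohm_in"]
    return rho * turn_length_in * ampere_turns ** 2 / area_in2


def coil_weight_lb(material, area_in2, turn_length_in):
    """Weight of the winding: density * A * P."""
    return WINDINGS[material]["lb_per_in3"] * area_in2 * turn_length_in


def power_weight_product(material, ampere_turns, area_in2, turn_length_in):
    """Product of power and weight; the winding area A cancels out."""
    return (coil_power_watts(material, ampere_turns, area_in2, turn_length_in)
            * coil_weight_lb(material, area_in2, turn_length_in))


if __name__ == "__main__":
    # Hypothetical coil: 100 ampere turns, 0.5 in^2 of winding, 10 in mean turn.
    for metal in WINDINGS:
        print(metal,
              coil_power_watts(metal, 100.0, 0.5, 10.0),
              coil_weight_lb(metal, 0.5, 10.0),
              power_weight_product(metal, 100.0, 0.5, 10.0))
```

Because the winding area cancels in the product, the ratio of the two power-weight products depends only on the materials, which is why a single factor of about two can be quoted.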
III. QUANTITATIVE DESIGN OF MAGNETIZED BARS FOR TORQUE


A magnetized bar, either a permanent magnet or the core of an
electromagnet, displays a moment, or normalized torque, proportional
to the product of the volume of the bar and the intrinsic induction
within the bar. The magnetic moment, $M_m$, is identified as normalized
torque in the familiar equation

$$M_m = \frac{T_c}{H_e}, \qquad (3)$$

and the relation between magnetic moment, intrinsic induction in the
bar, and the geometry of the bar is given by another familiar equation

$$M_m = \frac{B - H}{4\pi}\,A\,S \quad (\text{emu}) \qquad (4)$$

in which $B - H$ is the intrinsic induction, $A$ is the cross-section area
of the bar at the median plane, and $S$ is the effective distance between
poles. For a magnet of length $l$ and diameter $d$, one may define a
shortening factor, $R_S = S/l$, which evaluates the effect of recession of
the poles, and rewrite (4) as

$$M_m = \frac{B - H}{4\pi}\,A\,l\,R_S. \qquad (5)$$

The shortening factor, $R_S$, has been evaluated by Okoshi.¹ Okoshi's
values are plotted as a function of $l/d$ of the bar in Fig. 1, with a
broken line extrapolation guided by experimental data.

Fig. 1— Effective shortening of magnets with increasing aspect ratio (horizontal axis: l/d).

If one combines (3) and (5) after conversion to practical units, the result is

$$\frac{T_c}{H_e}\ (\text{lb-in per oersted}) = 1.155 \times 10^{-6}\,(B - H)\,A\,l\,R_S \qquad (6)$$

and the required volume of bar to produce a specified magnetic moment
is given by rearrangement of (6)

$$\text{Vol (in}^3) = \frac{1}{R_S}\,\frac{0.866 \times 10^{6}}{B - H}\,\frac{T_c}{H_e}. \qquad (7)$$
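Equations (6) and (7) lend themselves to a two-line calculation. The sketch below is an illustration only: the function names are ours, and the default $R_S = 1$ is an assumption that happens to reproduce the 17.32 cubic inches quoted in the Section IV example; for a real design $R_S$ would be read from Fig. 1.

```python
# Sketch of equations (6) and (7) for a magnetized bar, in the paper's
# practical units (inches, gauss, oersteds, lb-in per oersted).
# Function names are ours; r_s = 1.0 is an assumed default, not a paper value.

K_BAR = 1.155e-6  # lb-in per oersted, per gauss of (B - H), per cubic inch


def bar_moment(intrinsic_induction: float, volume_in3: float, r_s: float = 1.0) -> float:
    """Equation (6): T_c/H_e = 1.155e-6 (B - H) (A l) R_S."""
    return K_BAR * intrinsic_induction * volume_in3 * r_s


def bar_volume(moment: float, intrinsic_induction: float, r_s: float = 1.0) -> float:
    """Equation (7): bar volume A*l (in^3) needed for a moment T_c/H_e."""
    return moment / (K_BAR * intrinsic_induction * r_s)


volume = bar_volume(0.2, 10_000.0)      # moment 0.2, (B - H) = 10,000
print(round(volume, 2), bar_moment(10_000.0, volume))
```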


3.1 Optimum Bar Shape—The Load Factor 

When a bar of ferromagnetic material is placed in a field of 
strength H,, it assumes a state of magnetization which is commonly 
described by the equation 


$$H = H_a - D_B (B - H)$$

which may also be written

$$D_B = \frac{H_a - H}{B - H}. \qquad (8)$$


Here $H_a$ is the applied magnetizing field, and $B$ and $H$ describe the
condition of magnetization within the bar. The demagnetizing factor,
$D_B$, which is partially defined by (8), is used to express the dependence
of the intrinsic induction within the bar upon the aspect ratio ($l/d$) of
the bar. It has been tabulated and charted as a function of $l/d$ by
Okoshi,¹ Bozorth and Chapin,² and others. These sources agree upon
the value of $D_B$ for long slender magnets. For shorter magnets, where
there is some disagreement, we find the Okoshi data to be in agreement
with experiment.

In plotting the characteristics of magnetic materials we normally
plot the intrinsic induction ($B - H$) or the flux density ($B$) as the
dependent variable, and the field strength ($H$ or $H_a$) as the
independent variable. Hence, the ratio

$$\frac{B - H}{H_a - H} = \frac{1}{D_B} \qquad (9)$$

becomes the slope of the generalized load line of the magnetized bar.
This reciprocal of the demagnetizing factor is found useful in numerous
magnetic calculations and possibly deserves a name and symbol of its
own. We have elected to call it the loading factor, with the symbol $U$,
and have plotted it as a function of $l/d$ in Fig. 2. In the electromagnet
core operating below saturation, $H$ is generally negligibly small compared
with either $B$ or $H_a$, and the expression for the loading factor reduces
to $U = B/H_a$. In the permanent magnet, $H_a$ disappears and the
expression for loading factor becomes $U = (B - H)/(-H)$. For long
magnets ($l/d > 5$), $H$ is negligibly small compared with $B$, and the
loading factor is further simplified to $U = B/(-H)$. In this restricted
form the loading factor is identified with the "permeance coefficient"
and similar terms used in the literature of permanent magnets.

In Fig. 3 we illustrate the application of the loading factor to the
analysis of permanent magnets and electromagnets. For this illustration
each is assumed to have $l/d = 33$ so that the loading factor $U \approx 400$.
In the permanent magnet (Remendur) the magnetomotive force


Fig. 2— Variation of load factor with aspect ratio (horizontal axis: l/d; one curve is labeled SPHEROID).

Fig. 3— Application of load factor to magnet design: (a) permanent magnet
(Remendur 38), (b) electromagnet (Permalloy 45 core). The load line shown has slope U = 400.


is generated within the magnet and varies with the loading of the
magnet as indicated by the $B$, $H$ curve. (Here $H$ is sufficiently small
that $B$ is indistinguishable from $B - H$.) The operating point is
determined by the intersection of the $B$, $H$ curve with the load line
of slope $U$. This is a fixed point for a particular magnet with a particular
condition of magnetization. The conditions of Fig. 3 were chosen
to match the characteristics of Remendur. In the electromagnet when
operated below saturation, $H$ is negligibly small so the load line represents
the relation between flux density $B$ in the bar and applied field $H_a$,
up to the region (around $B = 10{,}000$ for Permalloy 45) where the core
material starts to saturate and the characteristic starts deviating from
the straight load line.
3.2 Design of Permanent Magnets for High Torque-Weight Ratio

The weight of the magnet in pounds is

$$W_m = A\,l\,\rho, \qquad (10)$$

in which $\rho$ is the density of the magnet material in lb/in³. One may combine
(10) with (6) to obtain

$$\frac{T_c}{W_m H_e} = \frac{1.155 \times 10^{-6}}{\rho}\,(B - H)\,R_S. \qquad (11)$$

Here the left-hand side of (11) is the normalized torque-weight ratio.
The design objective is to maximize this ratio.

The dependence of the operating point of the permanent magnet
upon the load factor, $U$, has been illustrated in Fig. 3. On similar charts
one may plot intrinsic induction ($B - H$) as a function of field ($H$)
for various magnet materials as in Fig. 4. Values of $B$ and $H$ for these
plots may be derived readily from the regular demagnetization curves
supplied by magnet manufacturers. Then, for a particular value of
$l/d$ one may pick off the corresponding value of $U$ from Fig. 2, and
using this as the slope of the load line may find the value of ($B - H$)
for a particular magnet material at the point of intersection. This
value of ($B - H$) and the value of $\rho$ appropriate to the material may
be inserted in (11) to give the normalized torque-weight ratio. For
example, at $l/d = 4$, $U \approx 17.5$. The load line of that slope intersects
the intrinsic induction curve for Alnico 5 at ($B - H$) = 10,000. Inserting
this value and the value of $R_S$ in (11) and using $\rho = 0.26$ for
Alnico, one obtains

$$\frac{T_c}{W_m H_e} = 3.6 \times 10^{-2}. \qquad (12)$$

Repeating this procedure for various values of $l/d$ and for various
materials, one can assemble the necessary data to plot the curves of
Fig. 5.

Fig. 4— Intrinsic induction of permanent magnets (horizontal axis in oersteds): (a) directional grain
ceramic, (b) Alnico 9, (c) Alnico 5.

Fig. 5— Variation of torque-weight ratio with aspect ratio: (a) directional
grain ceramic, (b) Alnico 9, (c) Alnico 5, (d) Remendur 38.

It is evident that for each magnet material there is a value of $l/d$
above which the torque-weight ratio is essentially constant, and below
which the torque-weight ratio falls off rapidly with decreasing $l/d$.
This follows the shape of the demagnetization curves of Fig. 4. This
value of $l/d$ is large for magnets having low coercivity, and small for
magnets having high coercivity. One would normally design the magnet
to operate on the flat part of the characteristic to obtain high
torque-weight ratio.
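The Alnico 5 example can be checked numerically. In the sketch below the function names are ours, and the shortening factor is not a figure read from Fig. 1: it is simply the value that (11) requires in order to reproduce the quoted result (12), which we label as an assumption.

```python
# Sketch of equation (11): normalized torque-weight ratio of a bar magnet.
# R_S for l/d = 4 is not quoted in the text; implied_r_s() backs out the
# value implied by the paper's own result (12). That is our assumption.

K_BAR = 1.155e-6  # constant of equations (6) and (11)


def torque_weight_ratio(intrinsic_induction: float, density: float, r_s: float) -> float:
    """Equation (11): T_c/(W_m H_e) = 1.155e-6 (B - H) R_S / rho."""
    return K_BAR * intrinsic_induction * r_s / density


def implied_r_s(ratio: float, intrinsic_induction: float, density: float) -> float:
    """Invert (11) for the shortening factor that reproduces a quoted ratio."""
    return ratio * density / (K_BAR * intrinsic_induction)


r_s = implied_r_s(3.6e-2, 10_000.0, 0.26)   # Alnico 5 at l/d = 4
print(round(r_s, 2), torque_weight_ratio(10_000.0, 0.26, r_s))
```

The implied shortening factor comes out near 0.8, which is at least of the magnitude one expects for a short bar.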


3.3 Design of Electromagnets for Torque 

The electromagnet is assumed to consist of a cylindrical core of
ferromagnetic material with a solenoid wound around it. The design
formula for the core is the same as that for the permanent magnet and
is given in (7). This gives the required volume to produce a specified
moment, operating the core at a specified value of flux density. The
ampere-turn requirements are derived as follows.

In terms of equivalent ampere turns, the applied field is given by

$$H_a = \frac{NI}{2.021\,l}. \qquad (13)$$

If, as is usually the case in the electromagnet, $H$ is negligibly small
compared with $H_a$, then one may combine (8) and (13) to obtain

$$NI = 2.021\,l\,D_B (B - H). \qquad (14)$$

If one multiplies and divides the right-hand side of (14) by $(Al)^{1/3}$ and
collects terms, one obtains

$$NI = 2.021\,\frac{l}{(Al)^{1/3}}\,D_B (B - H)(Al)^{1/3}. \qquad (15)$$

But, for a cylindrical core with $A = \pi d^2/4$,

$$\frac{l}{(Al)^{1/3}} = \left(\frac{4}{\pi}\right)^{1/3} (l/d)^{2/3}. \qquad (16)$$

Combining (15), (16), and (6),

$$NI = 1.9 \times 10^{6}\,\frac{T_c}{H_e}\,(Al)^{-2/3}\,\frac{D_B}{R_S}\,(l/d)^{2/3}. \qquad (17)$$

Since $D_B$ and $R_S$ are functions of aspect ratio ($l/d$) one may define
an aspect ratio factor,

$$F_a = D_B (R_S)^{-1} (l/d)^{2/3} \qquad (18)$$

and chart it as a function of ($l/d$) as in Fig. 6. Then combining (17)
and (18) gives

$$NI = 1.9 \times 10^{6}\,\frac{T_c}{H_e}\,\frac{F_a}{(Al)^{2/3}}. \qquad (19)$$

For any proposed geometry of an electromagnet with specified value
of magnetic moment ($T_c/H_e$) the ampere-turn requirement may be
determined from (19) and then translated into power and weight
requirements by reference to Table II.
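A minimal sketch of (18) and (19) follows. The function names are ours; $F_a$ must in practice be read from Fig. 6, and the values used in the demonstration (a moment of 0.2, a core volume of 17.32 cubic inches, and $F_a$ = 0.1 and 0.012) are the ones quoted in the Section IV example.

```python
# Sketch of equations (18) and (19): ampere turns of an electromagnet core.
# F_a is normally read from Fig. 6; the sample values are those quoted in
# the Section IV example. Function names are ours.
import math


def aspect_ratio_factor(d_b: float, r_s: float, aspect: float) -> float:
    """Equation (18): F_a = D_B (l/d)^(2/3) / R_S."""
    return d_b * aspect ** (2.0 / 3.0) / r_s


def core_ampere_turns(moment: float, volume_in3: float, f_a: float) -> float:
    """Equation (19): NI = 1.9e6 (T_c/H_e) F_a / (A l)^(2/3)."""
    return 1.9e6 * moment * f_a / volume_in3 ** (2.0 / 3.0)


def core_diameter(volume_in3: float, length_in: float) -> float:
    """Diameter (in) of a cylindrical core of given volume and length."""
    return math.sqrt(4.0 * volume_in3 / (math.pi * length_in))


print(core_diameter(17.32, 10.0))            # about 1.485 in
print(core_ampere_turns(0.2, 17.32, 0.1))    # close to the 5640 quoted later
print(core_ampere_turns(0.2, 17.32, 0.012))  # close to the 680 quoted later
```

Because the constant 1.9 × 10⁶ is rounded, the computed ampere turns agree with the quoted figures only to within about one percent.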


3.4 The Semi-Permanent Magnet 


A permanent magnet material with low coercivity and high remanence,
as exhibited by Remendur in Fig. 3, offers the inviting possibility
of easy magnetization and reversal of field by means of short
pulses of current through a winding. Between pulses it acts as a permanent
magnet, with polarity determined by the direction of the preceding
pulse. Thus, it provides a switchable field with very low expenditure
of power. Complete demagnetization, or "knock-down," however, requires
much more sophisticated circuitry. Also, the certainty
of complete demagnetization from an applied pulse or series of pulses
depends to a considerable extent upon the preceding history of the
magnet. For this reason, it is not likely to replace readily the simple
air-core coil or electromagnet unless the available power is so
severely limited as to justify the added circuit development effort.

Fig. 6— The aspect ratio function ($F_a$).


IV. INTERCOMPARISON—AIR-CORE COILS AND ELECTROMAGNETS 


In the preceding sections we have developed design formulas and
design graphs which enable us to estimate with fair quantitative accuracy
the size and weight of various magnetic structures to satisfy
torque requirements as specified. Intercomparison of the air-core coil
and electromagnet offers an interesting illustration of the use of these
techniques. We consider a typical example which assumes a spherical
satellite of 45 inches effective diameter in which there is required an
available magnetic moment of 0.2 lb-in per oersted which can be
turned on or off at will. It is further assumed that an upper limit of
nine pounds weight and twelve watts power consumption is to be
imposed upon the magnetic circuitry.

First, we assume that an air-core coil is laid out around the equator
of the satellite to enclose maximum area, and that this area is

$$A = \frac{\pi}{4}(45)^2 = 1590\ \text{in}^2.$$

Substituting this value of $A$ in (2) gives the required ampere turns,

$$NI = \frac{1.667 \times 10^{6}}{1.590 \times 10^{3}} \times 0.2 = 210.$$

The average length per turn of winding is $45\pi = 141$ inches. Using the
formula for aluminum from Table II we can show that the power ×
weight product is

$$\text{power} \times \text{weight} = 0.119 \times 10^{-6}\,(141)^2 (210)^2 = 105.$$

If we use the total weight allowance of nine pounds for the winding
then the required power is

$$\text{power} = \frac{105}{9} = 11.7\ \text{watts}.$$

This is within the permitted 12 watts, so we have shown that it is
feasible to use an equatorial coil.
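The equatorial-coil estimate may be retraced as follows. This is a sketch only: the coefficient 0.119 × 10⁻⁶ is the aluminum power-weight constant the text takes from Table II, the function name is ours, and because the text rounds intermediate values the computed figures differ slightly from 105 and 11.7.

```python
# Sketch of the equatorial-coil estimate. 0.119e-6 is the aluminum
# power-weight constant quoted from Table II; names are ours.
import math

AL_POWER_WEIGHT = 0.119e-6  # watt-pounds per (inch x ampere-turn)^2, aluminum


def power_weight_product(turn_length_in: float, ampere_turns: float) -> float:
    """Power x weight (watt-pounds) of an aluminum winding."""
    return AL_POWER_WEIGHT * turn_length_in ** 2 * ampere_turns ** 2


diameter = 45.0                               # inches
area = math.pi / 4.0 * diameter ** 2          # about 1590 in^2
ampere_turns = 1.667e6 * 0.2 / area           # equation (2), about 210
turn_length = math.pi * diameter              # about 141 in
product = power_weight_product(turn_length, ampere_turns)
power_at_9_lb = product / 9.0                 # watts for a 9 lb winding
print(round(area), round(ampere_turns), round(product, 1), round(power_at_9_lb, 1))
```

Since only the product of power and weight is fixed, any split between the two that keeps the product constant is equally admissible; the text simply spends the full nine-pound allowance on the winding.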

Turning now to the design of the electromagnet, one inserts $T_c/H_e = 0.2$
and $(B - H) = 10{,}000$ in (7) to show that the required volume of
core material is

$$\text{vol} = \frac{0.866 \times 10^{6}}{10^{4}} \times 0.2 = 17.32\ \text{in}^3.$$

Assuming density of 0.26 lb/in³, the core will weigh 4.5 pounds.

The characteristics of the winding, however, are closely dependent
upon the aspect ratio of the core. To illustrate this point we consider
two shapes, one to be 10 inches long, the other to be 45 inches long
to just fit in along the spin axis of the satellite. For the 10-inch core,

$$\frac{\pi}{4}(10)\,d^2 = 17.32; \quad d = 1.485''; \quad l/d = 6.73; \quad F_a = 0.1.$$

Substituting these values in (19) gives the required ampere turns,

$$NI = \frac{1.9 \times 10^{6}}{(17.32)^{2/3}}\,(0.2)(0.1) = 5640.$$
If we assume the average diameter of the winding is 1.6 inches then
the average length of turn, $P = 5.05$ inches. We insert these numbers
into the power × weight formula for aluminum in Table II and obtain

$$\text{power} \times \text{weight} = 0.119 \times 10^{-6}\,(5.05)^2 (5.64)^2 \times 10^{6} = 97.$$

If we let the winding weigh 4.5 pounds to use up the residue of the
weight allotment, then the power requirement is 21.5 watts. Since the
maximum allowable power dissipation is 12 watts, it is evident that
the 10-inch electromagnet, as described, cannot satisfy the requirements.

For the 45-inch core:

$$\frac{\pi}{4}(45)\,d^2 = 17.32; \quad d = 0.7''; \quad l/d = 64; \quad F_a = 0.012$$

and substituting these numbers in (19) gives the required ampere
turns,

$$NI = \frac{1.9 \times 10^{6}}{(17.32)^{2/3}}\,(0.2)(0.012) = 680.$$

If we assume that the average diameter of the winding is 0.85 inch,
then the average length of turn, $P = 2.67$ inches, and

$$\text{power} \times \text{weight} = 0.119 \times 10^{-6}\,(2.67)^2 (680)^2 = 0.39.$$

So we may use 0.5 pound of winding to bring the total weight only to
five pounds, and the required power will be only 0.78 watt. This
illustrates the advantage of the long slender electromagnet over the
short one for purposes of producing torque.
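The comparison of the two cores can be condensed into one helper. In this sketch every input number ($F_a$, winding diameter, winding weight, moment, and core volume) is taken from the example above; the function name is ours, and 0.119 × 10⁻⁶ is again the aluminum constant quoted from Table II.

```python
# Sketch comparing the two electromagnet cores of the example. Inputs are
# the paper's; the helper name is ours.
import math


def winding_power(f_a: float, winding_diameter_in: float, winding_weight_lb: float,
                  moment: float = 0.2, volume_in3: float = 17.32) -> float:
    """Power (watts) of an aluminum winding on a core, via (19) and Table II."""
    ampere_turns = 1.9e6 * moment * f_a / volume_in3 ** (2.0 / 3.0)  # equation (19)
    turn_length = math.pi * winding_diameter_in                       # inches per turn
    product = 0.119e-6 * turn_length ** 2 * ampere_turns ** 2         # watt-pounds
    return product / winding_weight_lb


short_core = winding_power(f_a=0.1, winding_diameter_in=1.6, winding_weight_lb=4.5)
long_core = winding_power(f_a=0.012, winding_diameter_in=0.85, winding_weight_lb=0.5)
print(f"10-inch core: {short_core:.1f} W, 45-inch core: {long_core:.2f} W")
```

The power scales as the square of $F_a$, which is why the long core, with an aspect ratio factor roughly eight times smaller, needs so much less power even with a much lighter winding.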

It is evident that the specified conditions of the example can be 
satisfied by the equatorial coil or by the long slender electromagnet. 
On the basis of the calculated results one might well prefer the 
electromagnet, which satisfies the requirements with a substantial 
saving of power and weight. However, other factors must be con- 
sidered. It is not likely to be convenient to mount the magnet full 
length along the spin axis because of interference with other equipment. 
In a core of this length, a very small amount of residual magnetization 
after removal of current will result in a considerable magnetic mo- 
ment, rather than the desired zero magnetic moment which is charac- 
teristic of the de-energized air-core coil. The weight distribution of 
the electromagnet along the spin axis decreases the spin stability,
while the weight distribution of the equatorial coil enhances the spin
stability of the satellite. For these and other reasons the equatorial
coil remains a favored method of attitude control in the communications
satellite.


V. OPTIMUM DESIGN OF PERMANENT MAGNETS FOR FRICTION DAMPING 


There have been various proposals to provide friction damping of 
roll or precession of the spin axis by mounting a small magnet within 
a hollow spherical enclosure attached to the satellite. The magnet 
would tend to maintain its alignment in the earth’s field and to pro- 
vide damping through friction contact with the interior of the sphere. 
For this application, if it exists, or for any similar application, one 
would wish to design for maximum normalized torque and minimum 
normalized period of oscillation in the field and within the confines 
of the sphere. 


5.1 Design for Maximum Moment within Limiting Spheres 
Referring to (6) and dividing through by $D^3$, where $D$ is the diameter
of the sphere in inches and $D^3 = (d^2 + l^2)^{3/2}$,

$$\frac{T_c}{H_e D^3} = 0.91 \times 10^{-6}\,(l/d)\,[1 + (l/d)^2]^{-3/2}\,[B - H]\,R_S. \qquad (20)$$

The relation expressed by (20) is displayed in Fig. 7 for the same
magnet materials for which the torque-weight relation was shown
in Fig. 5.
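It may help to separate the purely geometric part of (20) from the material part. The sketch below (names ours) evaluates $g(l/d) = (l/d)[1 + (l/d)^2]^{-3/2}$ and locates its maximum by a coarse scan; setting the derivative to zero gives $l/d = 1/\sqrt{2}$. Since ($B - H$) and $R_S$ also change with $l/d$, this peak is not by itself the design optimum shown in Fig. 7.

```python
# Sketch of the geometric part of equation (20),
#   g(x) = x (1 + x^2)^(-3/2),  x = l/d,
# and of (20) itself. (B - H) and R_S must be supplied from Figs. 4 and 1.
import math


def geometric_factor(aspect: float) -> float:
    """(l/d) [1 + (l/d)^2]^(-3/2) from equation (20)."""
    return aspect * (1.0 + aspect ** 2) ** -1.5


def normalized_moment(aspect: float, intrinsic_induction: float, r_s: float) -> float:
    """Equation (20): T_c/(H_e D^3) for a cylinder inside a sphere of diameter D."""
    return 0.91e-6 * geometric_factor(aspect) * intrinsic_induction * r_s


# Coarse scan for the aspect ratio that maximizes the geometric factor alone.
grid = [0.01 * k for k in range(1, 1001)]
best = max(grid, key=geometric_factor)
print(best, 1.0 / math.sqrt(2.0))
```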


5.2 Design for Minimum Period of Oscillation within Circumscribed 
Sphere 
A magnet used to damp out roll or precession should have a natural
period of oscillation in the earth's field much shorter than the period
of the motion it is to damp out. This would suggest a magnet designed
to have minimum period of oscillation within a spherical enclosure.
The moment of inertia of the cylindrical magnet around a diameter
through its equator is given by

$$M_i = \frac{\pi \rho\, d^2 l}{192\, g^{*}}\,(3d^2 + 4l^2) \qquad (21)$$

and dividing through by $D^5$,

$$\frac{M_i}{D^5} = 42.6 \times 10^{-6}\,\rho\,(l/d)\,\frac{3 + 4(l/d)^2}{[1 + (l/d)^2]^{5/2}}. \qquad (22)$$

* $g$ = 384 in/sec/sec.
Fig. 7— Normalized moment within limiting sphere: (a) ceramic, (b) Alnico
9, (c) Alnico 5.

Combining (20) and (22) and collecting terms, one obtains

$$\frac{M_i H_e}{T_c D^2} = \frac{46.8\,\rho\,[3 + 4(l/d)^2]}{R_S (B - H)[1 + (l/d)^2]}. \qquad (23)$$

The period of oscillation is given by

$$\tau = 2\pi \sqrt{\frac{M_i}{T_c}}\ \text{sec}. \qquad (24)$$

Combining (23) and (24) yields

$$\frac{\tau \sqrt{H_e}}{D} = 2\pi \sqrt{\frac{46.8\,\rho\,[3 + 4(l/d)^2]}{R_S (B - H)[1 + (l/d)^2]}}. \qquad (25)$$

This normalized period of oscillation is displayed graphically as a
function of $l/d$ in Fig. 8. In designing a magnet for friction damping
one would probably select the best compromise between maximum
torque displayed in Fig. 7 and minimum period of oscillation as shown
in Fig. 8. This would suggest the use of Alnico 9 and design for $l/d \approx 1.5$.

Fig. 8— Period of oscillation within limiting sphere: (a) ceramic, (b) Alnico 9,
(c) Alnico 5.
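The chain from (23) through (25) can be written out as a short sketch. The function names are ours, and the inputs in the demonstration call are placeholders chosen only to exercise the code: real values of ($B - H$) and $R_S$ must be read from Figs. 4 and 1 for the chosen material and aspect ratio.

```python
# Sketch of equations (23) to (25) for a cylindrical magnet in a sphere.
# Units follow the paper (inches, lb/in^3, gauss, oersteds, seconds).
# The demonstration inputs are placeholders, not data from the paper.
import math


def inertia_ratio(aspect, density, intrinsic_induction, r_s):
    """Equation (23): M_i H_e / (T_c D^2)."""
    x2 = aspect ** 2
    return 46.8 * density * (3.0 + 4.0 * x2) / (r_s * intrinsic_induction * (1.0 + x2))


def normalized_period(aspect, density, intrinsic_induction, r_s):
    """Equation (25): tau sqrt(H_e) / D, using tau = 2 pi sqrt(M_i / T_c)."""
    return 2.0 * math.pi * math.sqrt(inertia_ratio(aspect, density, intrinsic_induction, r_s))


def period_seconds(aspect, density, intrinsic_induction, r_s, sphere_diameter_in, field_oe):
    """Period (sec) for a sphere of diameter D (in) in an ambient field H_e (Oe)."""
    scale = sphere_diameter_in / math.sqrt(field_oe)
    return normalized_period(aspect, density, intrinsic_induction, r_s) * scale


print(normalized_period(1.5, 0.26, 5000.0, 0.85))  # placeholder inputs
```

The constant 46.8 in (23) is just the ratio of the constants in (22) and (20), 42.6/0.91, which offers a quick consistency check on the three equations.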


VI. CHARACTERISTICS OF SPHERICAL AND SPHEROIDAL MAGNETS 


The spheroids are a family of solids the surfaces of which are
generated by ellipses revolving around an axis. Revolution around a
major axis generates a prolate spheroid for which $l/d > 1$. Revolution
around a minor axis generates an oblate spheroid for which $l/d < 1$.
Revolution of a circle around a diameter generates a sphere for
which $l/d = 1$.

Values of load factor, $U$, for spheroids are plotted in Fig. 2. The
volume of the spheroid is only two thirds that of a cylinder having
the same $l$ and $d$, so (11) becomes, for spheroids,

$$\frac{T_c}{W_m H_e} = \frac{1.733 \times 10^{-6}\,(B - H)\,R_S}{\rho}. \qquad (26)$$

From solutions of (26) one may plot curves for normalized torque-weight
ratio. In Fig. 9 we show a curve for spheroids of Alnico together
with a curve for cylinders of Alnico borrowed from Fig. 5.
While the spheroids show a somewhat better torque-weight ratio
than the cylinders, it is doubtful whether the advantage is sufficient
to offset the added cost of shaping and mounting.

The sphere might have unique advantages mounted in a spherical
enclosure for friction damping. For the sphere of diameter $D$, one may
rewrite (6)

$$\frac{T_c}{H_e} = 1.155 \times 10^{-6}\,(B - H)\,\frac{\pi}{4}\,D^3 R_S. \qquad (27)$$

Dividing through by $D^3$ and inserting the value of $R_S$ gives

$$\frac{T_c}{H_e D^3} = 0.81 \times 10^{-6}\,(B - H). \qquad (28)$$


This is an expression for the total normalized torque that can be
packed into a specified spherical enclosure. Solutions of (28) for
various magnet materials are collected in Table III. The moment of
inertia of the sphere is

$$M_i = 0.1\,M D^2 = 0.1\,\frac{\pi}{6}\,\frac{\rho}{g}\,D^5. \qquad (29)$$

Combining (27) and (29),

$$\frac{M_i H_e}{T_c} = \frac{0.1\,(\pi/6)\,D^5 \rho}{1.155 \times 10^{-6}\,(\pi/4)(B - H)\,D^3 g\,R_S} = \frac{150\,\rho D^2}{(B - H) R_S}. \qquad (30)$$

Combining (24) and (30),

$$\frac{\tau \sqrt{H_e}}{D} = 2\pi \sqrt{\frac{150\,\rho}{(B - H) R_S}}. \qquad (31)$$

Fig. 9— Comparison, spheroidal and cylindrical magnets of Alnico 5.
TABLE III — MAGNETIC CHARACTERISTICS OF SPHERES

Material    ρ      (B − H)   T_c/(W_m H_e)    T_c/(H_e D³)     τ√H_e / D
Ceramic     0.15   3700      3.81 × 10⁻²      3.0 × 10⁻³       0.514
Alnico 9    0.26   4600      2.73 × 10⁻²      3.74 × 10⁻³      0.613
Alnico 5    0.26   1900      1.18 × 10⁻²      1.54 × 10⁻³      0.955


Equations (26), (28), and (31) define for the sphere the same 
normalized quantities which are plotted for the cylinder in Figs. 5, 7, 
and 8. Solutions of these equations for various magnet materials are 
listed in Table III. The combination of the table and the three figures 
gives all the information required to select the preferred material and 
geometry for a specified application and to arrive at a quantitative 
design of the magnet. 
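The entries of Table III can be regenerated from (26), (28), and (31). The sketch below is an illustration under one stated assumption: the shortening factor of the sphere is not printed in the text, so it is backed out here from the constants of (27) and (28). The names are ours. The computed values agree with the table to within a few percent.

```python
# Sketch reproducing Table III from equations (26), (28), and (31).
# R_S_SPHERE is inferred from the constants of (27) and (28); that
# inference is our assumption, not a value stated in the paper.
import math

R_S_SPHERE = 0.81e-6 / (1.155e-6 * math.pi / 4.0)   # about 0.89

MATERIALS = {          # name: (density in lb/in^3, intrinsic induction B - H)
    "Ceramic": (0.15, 3700.0),
    "Alnico 9": (0.26, 4600.0),
    "Alnico 5": (0.26, 1900.0),
}


def sphere_row(density: float, induction: float) -> tuple[float, float, float]:
    """Return (T_c/(W_m H_e), T_c/(H_e D^3), tau sqrt(H_e)/D) for a sphere."""
    torque_weight = 1.733e-6 * induction * R_S_SPHERE / density          # (26)
    torque_volume = 0.81e-6 * induction                                  # (28)
    period = 2.0 * math.pi * math.sqrt(
        150.0 * density / (induction * R_S_SPHERE))                      # (31)
    return torque_weight, torque_volume, period


for name, (rho, b_minus_h) in MATERIALS.items():
    tw, tv, per = sphere_row(rho, b_minus_h)
    print(f"{name:9s} {tw:.2e} {tv:.2e} {per:.3f}")
```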


VII. SATELLITE MAGNETIC MEASUREMENTS 


Satellites with spin stabilization introduce two magnetic measuring 
problems—measurement of “drag” and measurement of residual mo- 
ment. The “drag” results from eddy currents induced in the rotating 
metal shell of the satellite by the earth’s magnetic field. The energy 
dissipated in these eddy currents must be derived from the rotational 
energy of the satellite, and there results a decay of the spin rate. 
One wishes to evaluate the rate of this decay to forecast when the 
spin rate will fall below the minimum required for stability. The 
moment measurement is to detect any residual magnetic moment 
perpendicular to the axis of spin which will interact with the earth’s 
magnetic field to induce precession of the spin axis. After an accurate 
measurement this moment is canceled out by mounting in the satellite 
a small permanent magnet of equal moment and opposite polarity. 
Both measurements—drag and moment—can be made conveniently 
with a specially designed coil array. 


7.1 The Telstar® Coil Array 


The drag test requires a reasonably uniform field over the volume
of the satellite. A paper analysis reveals that this can be provided
by an array of coils of reasonable size with a particular distribution
of ampere turns. Two coils, each of radius $r$ and of $N$ turns, are spaced
$\pm r/4$ from an assumed zero point on a common axis. Two other coils,
each of radius $r$ and $7N/3$ turns, are spaced $\pm r$ from the assumed
zero along the common axis. The coils are connected in series to run
at the same current so that the outer coils have effectively 7/3 times
as many ampere turns as the inner pair. The arrangement of coils and
the resulting distribution of field along the axis are shown in Fig. 10.
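The axial uniformity of such an array can be estimated with the textbook on-axis field of a thin circular loop, which is proportional to $r^2/[r^2 + (z - z_0)^2]^{3/2}$. The sketch below is an idealization (thin loops, field on the axis only, arbitrary field units, lengths in units of the coil radius), and the function names are ours.

```python
# Sketch of the on-axis field of the four-coil array, using the textbook
# on-axis field of a thin circular loop. Lengths are in units of the coil
# radius r and the field is in arbitrary units. Names are ours.

COILS = [            # (axial position / r, relative number of turns)
    (-1.0, 7.0 / 3.0),
    (-0.25, 1.0),
    (0.25, 1.0),
    (1.0, 7.0 / 3.0),
]


def axial_field(z: float) -> float:
    """Relative field on the axis at position z (in coil radii)."""
    return sum(turns / (1.0 + (z - z0) ** 2) ** 1.5 for z0, turns in COILS)


def max_deviation(half_length: float, steps: int = 200) -> float:
    """Largest fractional departure from the center field for |z| <= half_length."""
    center = axial_field(0.0)
    points = [half_length * k / steps for k in range(-steps, steps + 1)]
    return max(abs(axial_field(z) / center - 1.0) for z in points)


print(f"{max_deviation(0.5):.4f}")        # non-uniformity over |z| <= r/2
print(f"{max_deviation(2.0 / 3.0):.4f}")  # non-uniformity over |z| <= 2r/3
```

In this idealized model the axial field stays within one percent of its center value out to two thirds of the coil radius. Off-axis uniformity is not addressed by this sketch.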

It was established by measurements that the region of uniform 
field extended out radially from the axis to include a spherical volume, 
the radius of which is roughly two thirds the radius of the coils. 
Hence, an array of coils five feet in diameter easily provided uniform 
field over the volume of the satellite. (If a conventional Helmholtz 
array were used the coils would have to be about 10 feet in diameter 
to achieve reasonably uniform field over the same volume.) This 
array was mounted on a turntable and rotated around the satellite 
which was supported by a calibrated torque suspension. From the 
result of this drag measurement it was possible to calculate the rate 
of decay of satellite spin in the earth’s magnetic field. 


7.2 Measurements of Magnetic Moment 


The magnetic moment perpendicular to the spin axis of the satellite
was measured by rotating it within a coil array similar to the one
used for drag tests except that the windings were connected to an integrating
fluxmeter. One reasons intuitively that if a magnetic object is
aligned parallel to the axis of the coils and rotated 180°, it will give a
deflection of the integrating fluxmeter proportional to the moment,
and that the proportionality constant will be unaffected by the position
of the magnet in the array as long as it is within the volume in which
the array produces uniform field. This intuitive reasoning has been
confirmed by various measurements. The proportionality constant
for the array is established by calibration with a small air-core coil of
known $NIA$, for which the moment can be calculated from (2). A
two-to-one scale down of the array has proved to be convenient for
bench measurements of magnetic moment of small magnetic objects.

Fig. 10— Field distribution along axis of Telstar® array.
APPENDIX

Definition of Symbols

The following letter symbols have been adopted for use in this paper.

A = section area of magnet or winding.
B = flux density.
(B − H) = intrinsic induction.
d = diameter of magnet.
D = diameter of enclosing sphere.
D_B = demagnetizing factor.
F_a = aspect ratio factor, as defined in text.
g = gravity (384 in/sec/sec).
H = field strength in magnet.
H_e = ambient field, or field of interaction.
H_a = applied magnetizing field.
l = length of magnet.
M = mass.
M_i = moment of inertia.
M_m = magnetic moment.
NI = ampere turns.
R_S = shortening ratio, from recession of poles.
T_c = torque between magnet and perpendicular field.
τ = period of mechanical oscillation.
U = load factor, reciprocal of demagnetizing factor.
Nonlinear Optical Coefficients


By F. N. H. ROBINSON 
(Manuscript received January 18, 1967) 


We consider, from a number of different viewpoints, the tensor coefficients
which describe second harmonic generation, optical rectification, and the
Pockels or linear electro-optic effect in acentric crystals. Stationary perturbation
theory is used to calculate the low-frequency limit of the intrinsic
electronic nonlinearity neglecting all effects due to local fields or lattice
polarization. Solid methane is used as an example and the result used to
estimate the coefficient in hexamethylene tetramine. The calculated result
is within a factor of 2 of the experimental figure. The method is susceptible
to further refinement and, since it requires only a knowledge of ground
state wave functions, and is essentially very simple, it appears to offer a
useful approach to the calculation of the coefficients.

The classical anharmonic oscillator model is briefly covered and the model
is related to a quantal treatment. We find that the anharmonic potential
used in the model is directly related to the actual crystalline potential. It
can also be related to the charge distribution in the electronic ground state.

Local field corrections and the effects of lattice polarization are presented.
These alter the nonlinear properties in a simple and obvious way, but one
which has been misunderstood in some of the literature.

Our results form a theoretical background to Miller's empirical rule
relating the nonlinear coefficients to the linear susceptibilities. An extensive
table of Miller-reduced tensor coefficients collated from the published literature
is presented.

Finally, we draw together some of the threads of the previous sections. 
An appendix deals with the vexing question of definitions. 


I. INTRODUCTION 


Second harmonic generation, optical rectification, and the linear
electro-optic effect are particular aspects of a process in which two
fields, $E_j^{\beta} e^{i\beta t}$ and $E_k^{\gamma} e^{i\gamma t}$, generate a polarization

$$P_i^{\alpha} e^{i\alpha t} = d_{ijk}^{\alpha\beta\gamma} E_j^{\beta} E_k^{\gamma} e^{i(\beta+\gamma)t}. \qquad (1)$$

Our concern is with the tensor coefficients $d_{ijk}^{\alpha\beta\gamma}$ which (Nye¹) necessarily
vanish in centric (centrosymmetric) crystals and which, in acentric
crystals, are subject to symmetry restrictions (Kleinman²) which often
leave only one or two independent components.

Experimentally, the values of the allowed components of $d$ in different
materials and at different frequencies range from about $2 \times 10^{-10}$ esu
(cm/statvolt) to about $6 \times 10^{-6}$ esu. This range may be contrasted with
the linear optical susceptibility $\chi$ which is between 0.1 and 0.3 for the
vast majority of materials and only quite exceptionally exceeds unity.
There is, however, a connection between the tensor $d$ and $\chi$ which is
expressed by an important empirical rule due to Miller.³ If we write
$d_{ijk}^{\alpha\beta\gamma}$ as

$$d_{ijk}^{\alpha\beta\gamma} = \chi_{ii}^{\alpha}\,\chi_{jj}^{\beta}\,\chi_{kk}^{\gamma}\,\Delta_{ijk}, \qquad (2)$$

where $\chi_{ii}^{\alpha}$ is the $ii$th component of $\chi$ at a frequency $\alpha$, and if we have chosen
a principal axis system for $\chi$, then the allowed components of $\Delta_{ijk}$
for all effects and all materials are similar in magnitude. We shall see
in a later section that for very many materials in both the visible and
10 μ region of the spectrum (Patel⁴), $\Delta_{ijk}$ is near $3 \times 10^{-6}$ esu. No
materials with $\Delta$ above $20 \times 10^{-6}$ esu have yet been found and very
few are known to have a value below $0.2 \times 10^{-6}$ esu. In the case of
NH₄H₂PO₄, where the best measurements of s.h.g., optical rectification
and the electro-optic effect are available (Francois,⁵ Ward,⁶ Carpenter⁷),
the value of $\Delta_{123}$ from all three effects is $3 \times 10^{-6}$ esu within the experimental
error of 15 percent. The fact that s.h.g., a purely optical effect,
leads to the same value of $\Delta$ as rectification and the electro-optic effect
indicates quite clearly that the basic mechanism of the nonlinearities
is common to all three effects and must therefore reside in the electronic
motion of the system. In the next section, we shall concentrate on this
aspect of the problem and neglect the effects of local fields and lattice
polarization.
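Miller's rule (2) is simple enough to state as code. The sketch below treats a single tensor component in a principal-axis system; the function names are ours, and the sample inputs are only the typical magnitudes quoted in the text (a susceptibility between 0.1 and 0.3 and a reduced coefficient near 3 × 10⁻⁶ esu), not measurements on any particular crystal.

```python
# Sketch of Miller's rule, equation (2), for one tensor component in a
# principal-axis system: d = chi_i * chi_j * chi_k * Delta.
# Names are ours; sample inputs are typical magnitudes, not crystal data.


def miller_coefficient(chi_i: float, chi_j: float, chi_k: float, delta: float) -> float:
    """Nonlinear coefficient d_ijk (esu) from susceptibilities and Delta_ijk (esu)."""
    return chi_i * chi_j * chi_k * delta


def miller_delta(d: float, chi_i: float, chi_j: float, chi_k: float) -> float:
    """Invert the rule: Delta_ijk (esu) from a measured d_ijk (esu)."""
    return d / (chi_i * chi_j * chi_k)


typical_delta = 3e-6                     # esu, the typical value in the text
for chi in (0.1, 0.3):                   # typical range of linear susceptibility
    print(chi, miller_coefficient(chi, chi, chi, typical_delta))
```

Because $d$ depends on the cube of the susceptibility, a modest spread in $\chi$ accounts for a spread of several orders of magnitude in $d$ even when $\Delta$ is nearly constant.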

A number of authors (see Section IV for references) have given quantal
treatments of the optical nonlinearities whose end result is an expression
for the coefficients $d_{ijk}^{\alpha\beta\gamma}$ in terms of sums of rather inaccessible matrix
elements. Useful as these expressions are in establishing some of the
general properties of the coefficients, they are not a practical step on the
road to calculating the coefficients from other empirical quantities. At
the other extreme, the classical anharmonic oscillator model has been
used to demonstrate some of the qualitative features of nonlinear behavior
(see Section III). This treatment, though simple and appealing,
suffers from the defect that the relation between the parameters of
the model and those of the real system is obscure. In Section IV, we
shall remedy this defect and show that the two approaches are closely
related.

First, however, we give an approximate method of calculating the
low-frequency limit of the coefficients from stationary perturbation
theory in a form in which it has been successfully applied to the linear
properties of $n$-electron atoms (see, e.g., Dalgarno⁸).


II. MAGNITUDE OF THE COEFFICIENTS 


At low frequencies, i.e., well below any electronic resonance, we can
use stationary perturbation theory to calculate, to arbitrary order in
the applied field $\mathbf{E}$, the energy $W$ of the ground state. The polarization
is then given by

$$P_i = -\frac{\partial W}{\partial E_i}. \qquad (3)$$

We shall assume that we are dealing with a crystal containing $N$ identical
atoms or molecules in unit volume whose individual ground state energies
are $w$, so that $W = Nw$.

If $H_0$ is the Hamiltonian of an unperturbed molecule, its Hamiltonian
in the field $\mathbf{E}$ is

$$H = H_0 + h = H_0 - e\,\mathbf{E} \cdot \mathbf{R}, \qquad (4)$$

where

$$e\mathbf{R} = e \sum_{m=1}^{n} \mathbf{r}_m \qquad (5)$$

is the dipole moment operator of the molecule, and the sum extends over
all $n$ valence electrons. We can neglect the core electrons because of
their high binding energies. If we expand $w$ in increasing order in $E$ as

$$w = w_0 + w_1 + w_2 + w_3,\ \text{etc.}, \qquad (6)$$

the term $w_1$ gives the permanent dipole moment of the molecule, $w_2$
gives the linear susceptibility and $w_3$ gives a polarization quadratic in
$E$ which leads to the desired nonlinear coefficients. The electric field
will perturb the state function $\psi$ and we shall write the perturbed function
as either

$$\psi = \psi_0 + \psi_1 + \psi_2 + \cdots \quad \text{or} \quad |\,\rangle = |0\rangle + |1\rangle + |2\rangle + \cdots. \qquad (7)$$

Knowledge of $\psi$ or $|\,\rangle$ to first order in $E$ is sufficient to determine $w_1$, $w_2$,
and $w_3$; for
and ws; for 
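$$w_1 = \langle 0|h|0\rangle, \qquad w_2 = \langle 0|\tilde{h}|1\rangle, \qquad w_3 = \langle 1|\tilde{h}|1\rangle. \qquad (8)$$

Moreover, the correct value of $\psi_1$ or $|1\rangle$ is determined by the requirement
that it minimize $w$. Thus, we can obtain $|1\rangle$ by a variational
procedure and the only element of choice left to us is that of the trial
wave function.

Minimizing $w$ is equivalent (see Dalgarno and Lewis⁹) to the simpler
problem of minimizing

$$\langle 1|\tilde{H}_0|1\rangle + 2\langle 0|\tilde{h}|1\rangle, \qquad (9)$$

where the notation $\tilde{H}_0$ or $\tilde{h}$ means $H_0 - \langle 0|H_0|0\rangle$ or $h - \langle 0|h|0\rangle$.
As a trial function, we take

$$|1\rangle = \lambda \tilde{h}\,|0\rangle \qquad (10)$$

so that (9) becomes

$$\lambda^2 \langle 0|\tilde{h}\tilde{H}_0\tilde{h}|0\rangle + 2\lambda \langle 0|\tilde{h}^2|0\rangle. \qquad (11)$$

The minimization with respect to $\lambda$ gives

$$\lambda = -\frac{\langle 0|\tilde{h}^2|0\rangle}{\langle 0|\tilde{h}\tilde{H}_0\tilde{h}|0\rangle}. \qquad (12)$$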
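The unperturbed Hamiltonian of the system is of the form

$$ H_0 = -\frac{\hbar^2}{2m}\sum_n \nabla_n^2 + V, \tag{13} $$

and so, in the denominator of (12),

$$ \bar H_0\bar h = \bar h\bar H_0 + \frac{e\hbar^2}{m}\sum_n \mathbf{E}\cdot\nabla_n. \tag{14} $$

Thus,

$$ \langle 0|\bar h\bar H_0\bar h|0\rangle = -\frac{e^2\hbar^2}{m}\int \psi_0 \sum_n (\mathbf{E}\cdot\mathbf{r}_n) \sum_{n'} (\mathbf{E}\cdot\nabla_{n'})\,\psi_0\, d\tau, \tag{15} $$


where $d\tau$ is an element of configuration space and we have used
$\bar H_0|0\rangle = 0$. Equation (15) can be written as


$$ \langle 0|\bar h\bar H_0\bar h|0\rangle = -\frac{e^2\hbar^2}{2m}\int \sum_n (\mathbf{E}\cdot\mathbf{r}_n) \sum_{n'} (\mathbf{E}\cdot\nabla_{n'})\,\psi_0^2\, d\tau \tag{16} $$

and integrated by parts, to give

$$ \langle 0|\bar h\bar H_0\bar h|0\rangle = +\frac{n e^2\hbar^2}{2m}\,\mathbf{E}\cdot\mathbf{E}\int \psi_0^2\, d\tau, \tag{17} $$


where the discarded first integration part vanishes at the limits, if
these are infinite, or if they are the boundary of a cell in a periodic
lattice, provided only that $E$ does not vary appreciably within a cell
(dipole approximation).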

If we are dealing with isolated atoms, $\int \psi_0^2\, d\tau = 1$ and we have


$$ \langle 0|\bar h\bar H_0\bar h|0\rangle = \frac{n e^2\hbar^2 E^2}{2m}, \tag{18} $$


a somewhat unfamiliar form of the sum rule. If, on the other hand,
we are dealing with overlapping molecules in a periodic lattice, the
variational problem is to minimize the contribution to $w$ from a single
cell of the lattice. Thus, in (11) and all succeeding equations, the
integrals implied by the expectation values are to be taken only over the
interior of a cell. This will also apply to all integrals involved in
evaluating $w_2 = \langle 1|h|0\rangle$ and $w_3 = \langle 1|\bar h|1\rangle$. In this case (18) remains
changed. This can be shown to be a general consequence of time reversal 
invariance and the commutation rule 


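We now have


$$ |1\rangle = \lambda\bar h\,|0\rangle = -\frac{2m\,\langle 0|\bar h^2|0\rangle}{n e^2\hbar^2 E^2}\,\bar h\,|0\rangle \tag{20} $$


or


$$ |1\rangle = \frac{2me\,\langle 0|(\mathbf{E}\cdot\bar{\mathbf{R}})^2|0\rangle}{n\hbar^2 E^2}\,(\mathbf{E}\cdot\bar{\mathbf{R}})\,|0\rangle. \tag{21} $$


From this we obtain the second-order energy


$$ w_2 = -\frac{2}{n a_0}\,\frac{\langle(\mathbf{E}\cdot\bar{\mathbf{R}})^2\rangle^2}{E^2}, \tag{22} $$


where $a_0 = \hbar^2/me^2 = 0.53$ Å. If we let $\mathbf{E} = (E_x, 0, 0)$ and $\mathbf{R} = (X, Y, Z)$
this gives


$$ w_2 = -\frac{2}{n a_0}\,\langle X^2\rangle^2 E_x^2 \tag{23} $$

and the atomic polarizability is

$$ \alpha = \frac{4}{n a_0}\,\langle X^2\rangle^2. \tag{24} $$

For the H atom, this gives $\alpha = 4a_0^3$ instead of the correct value $4.5a_0^3$,
while for the helium atom, taking an effective nuclear charge $Z = 27/16$
gives $1.8 \times 10^{-25}$ ccs. The experimental value is $2.1 \times 10^{-25}$ ccs.
In general, (24) is a lower limit to $\alpha$, if we evaluate $\langle X^2\rangle$ correctly as the
expectation value of the mean square moment of all the electrons. If
the electrons are uncorrelated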


$$ \langle X^2\rangle = n\langle\xi^2\rangle, \tag{25} $$


where $\langle\xi^2\rangle$ refers to one electron. We used this procedure in helium since
the two electrons are in orthogonal spin states and are automatically
uncorrelated. In more complicated atoms correlation exists and almost
always results in


$$ \langle X^2\rangle < n\langle\xi^2\rangle \tag{26} $$

since electrons repel each other. Thus, while (24) is a lower limit we
cannot say anything about the sign of the error in


$$ \alpha = \frac{4n\langle\xi^2\rangle^2}{a_0}. \tag{27} $$
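As a quick numerical check of the variational estimate, the sketch below evaluates the uncorrelated-electron form $\alpha = 4n\langle\xi^2\rangle^2/a_0$ for the hydrogen atom. The function name and the use of $\langle\xi^2\rangle = a_0^2$ for the 1s state are assumptions made for illustration; they are not taken from the text.

```python
# Sketch: variational polarizability alpha = 4 n <xi^2>^2 / a0 (cgs units).
A0_CM = 0.53e-8  # Bohr radius in cm, as quoted in the text


def alpha_variational(n_electrons, xi2, a0=A0_CM):
    """Lower-bound polarizability (cm^3) for n uncorrelated electrons.

    xi2 is the one-electron mean square coordinate <xi^2> in cm^2.
    """
    return 4.0 * n_electrons * xi2 ** 2 / a0


# Hydrogen 1s state: <xi^2> = a0^2 (standard result, assumed here).
alpha_h = alpha_variational(1, A0_CM ** 2)
ratio_to_a0_cubed = alpha_h / A0_CM ** 3
print(ratio_to_a0_cubed)  # the text quotes 4 against the exact value 4.5
```

The estimate falls below the exact value, as expected for a lower limit.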





We note, in passing, that, in a solid with overlapping molecules, $\alpha$, the
polarizability, is large. This leads to an element of instability in the
situation, for as $\alpha$ increases the screening of the coulomb potential
becomes more effective and the electrons less localized, leading to a
further increase in $\alpha$ and eventually metallic behavior. For this reason,
most materials which are not regular insulators are metals. Those rare
materials which have values of $N\alpha$ appreciably greater than 0.3 ($n > 2.2$)
owe their existence to a rather delicate balance of forces.
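The third-order energy is


$$ w_3 = -e^3\left(\frac{2m}{n\hbar^2E^2}\right)^2 \langle(\mathbf{E}\cdot\bar{\mathbf{R}})^2\rangle^2\,\langle(\mathbf{E}\cdot\bar{\mathbf{R}})^3\rangle. \tag{28} $$


In most cases $\alpha$ is very nearly isotropic and we have


$$ \tfrac{1}{2}\alpha E^2 = -w_2 = \frac{2}{n a_0}\,\frac{\langle(\mathbf{E}\cdot\bar{\mathbf{R}})^2\rangle^2}{E^2} \tag{29} $$

so that

$$ w_3 = -\frac{\alpha}{n a_0 e}\,\langle(\mathbf{E}\cdot\bar{\mathbf{R}})^3\rangle. \tag{30} $$

With $N$ molecules in unit volume this gives a nonlinear coefficient

$$ d_{ijk} = \frac{3\chi}{n a_0 e}\,T_{ijk}, \tag{31} $$


where

$$ T_{ijk} = \langle\bar R_i\bar R_j\bar R_k\rangle = \langle R_iR_jR_k\rangle - \langle R_i\rangle\langle R_jR_k\rangle - \langle R_j\rangle\langle R_kR_i\rangle - \langle R_k\rangle\langle R_iR_j\rangle + 2\langle R_i\rangle\langle R_j\rangle\langle R_k\rangle. \tag{32} $$


Equation (31) is the central result of this section. It expresses $d_{ijk}$ in
terms of the linear (corrected for local fields) susceptibility $\chi$ and a
cubic moment (third-order semi-invariant) of the electronic distribution
in the ground state.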
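The cubic moment of (32) is a third-order semi-invariant, so it should not depend on the choice of origin. The sketch below checks this for a hypothetical distribution of four equal point weights at tetrahedral corners. The function names and the normalization to unit total weight (which makes the result a one-electron quantity) are illustrative assumptions.

```python
# Sketch: third-order semi-invariant of a weighted point distribution.
def weighted_mean(weights, values):
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def cubic_moment(points, weights, i, j, k):
    """<RiRjRk> - <Ri><RjRk> - <Rj><RkRi> - <Rk><RiRj> + 2<Ri><Rj><Rk>."""
    def avg(*axes):
        vals = []
        for p in points:
            prod = 1.0
            for a in axes:
                prod *= p[a]
            vals.append(prod)
        return weighted_mean(weights, vals)

    return (avg(i, j, k)
            - avg(i) * avg(j, k)
            - avg(j) * avg(k, i)
            - avg(k) * avg(i, j)
            + 2.0 * avg(i) * avg(j) * avg(k))


# Hypothetical test distribution: unit weights at tetrahedral corners.
tetra = [(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1)]
unit_w = [1.0] * 4
shifted = [(x + 0.3, y - 1.2, z + 2.0) for x, y, z in tetra]

t_centered = cubic_moment(tetra, unit_w, 0, 1, 2)
t_shifted = cubic_moment(shifted, unit_w, 0, 1, 2)
print(t_centered, t_shifted)
```

Both calls should agree to rounding error, which is the origin independence that makes the cubic moment a usable molecular property.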

If we neglect overlap and, for simplicity, also assume that the electrons
are uncorrelated so that $T_{ijk} = nt_{ijk}$, where $t_{ijk}$ refers to a single electron,
we have


$$ d_{ijk} = \frac{3\chi}{a_0 e}\,t_{ijk} \tag{33} $$


and $T_{ijk}$ is now, apart from numerical factors, the octupole moment of
the charge distribution. If the electron density in the molecule is $\rho(\mathbf{r})$,


$$ T_{ijk} = \iiint x_ix_jx_k\,\rho(\mathbf{r})\, d^3r. \tag{34} $$


If we account for local fields through a Lorentz correction the correct
value of $\chi$ to insert in (33) is obtained from


$$ \frac{n^2-1}{n^2+2} = \frac{4\pi}{3}\,\chi \tag{35} $$

and the observed value of $d_{ijk}$ (see Section V) is

$$ d'_{ijk} = \left(\frac{n^2+2}{3}\right)^3 d_{ijk}. \tag{36} $$


At first sight (33) seems to imply that $d$ is proportional to $\chi$, in conflict
with Miller's rule. However, $\chi$ depends on $N\langle R^2\rangle^2/n \approx nNr^4$ and $N$ is
inversely proportional to $\langle r^2\rangle^{3/2}$ so that $\chi \propto nr$, while $d \propto r^4$. Thus, $d$ is
in fact more nearly proportional to $\chi^3$ than $\chi$.

We now consider, as an example, the tetrahedral molecule methane,
CH$_4$, which crystallizes in the tetrahedral space group $F\bar43m$ with a
lattice parameter $\approx 6$ Å and a molar volume of 32 ccs. If we take
Cartesian axes along the sides of the cubic cell, the bonds point in the 111
and tetrahedrally related $\bar1\bar11$, $\bar11\bar1$, $1\bar1\bar1$ directions. From symmetry,
there is only one independent component of $d_{ijk}$, in which all the
subscripts are unequal.

The shortest C-C distance is 4.2 Å and from the size of the free molecule
we conclude that overlap is unimportant.

Turner, Saturno, Hauk and Parr have used one-center wave functions
to calculate the electronic density $\rho$ in the molecule. From this we can
obtain $t_{123}$ using (34).

The result is


$$ t_{123} = 0.5 \times 10^{-25}\ \text{cm}^3 \tag{37} $$


and this is not very sensitive to the limits of integration. The
experimental molar susceptibility of CH$_4$ is 1.6 ccs and so $\chi = 0.05$. Thus, if
we neglect correlations between the eight valence electrons we have
from (33)


$$ d_{123} = 3 \times 10^{-9}\ \text{esu}. \tag{38} $$
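The arithmetic behind this estimate can be sketched as follows, assuming that (33) has the form $d = 3\chi t/(a_0 e)$ in cgs-esu units and that the cubic moment is $0.5 \times 10^{-25}$ cm$^3$. The exponent is inferred from the quoted result rather than read directly from the source, and the electron charge value is the standard one.

```python
# Sketch: order-of-magnitude check of d_123 for methane (cgs-esu units).
A0_CM = 0.53e-8     # Bohr radius, cm
E_ESU = 4.8e-10     # electron charge, esu (standard value)


def d_from_cubic_moment(chi, t_ijk):
    """Assumed form of Eq. (33): d = 3 chi t / (a0 e)."""
    return 3.0 * chi * t_ijk / (A0_CM * E_ESU)


d_methane = d_from_cubic_moment(chi=0.05, t_ijk=0.5e-25)
print(f"{d_methane:.1e} esu")
```

With these inputs the result lands close to the value quoted in (38).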


In a similar way, neglecting correlations, we can calculate the molar
susceptibility from (27). Turner et al's charge density leads to


$$ \langle\xi^2\rangle = 3.3 \times 10^{-17}\ \text{cm}^2 $$


and so, with eight valence electrons, we obtain $\alpha = 6.5 \times 10^{-24}$ cm$^3$ and
a molar susceptibility of 3.9 ccs, rather over twice the experimental
value.
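These figures can be reproduced from the uncorrelated-electron polarizability $\alpha = 4n\langle\xi^2\rangle^2/a_0$. The sketch below assumes $\langle\xi^2\rangle = 3.3 \times 10^{-17}$ cm$^2$ and takes the molar susceptibility to be Avogadro's number times $\alpha$. The variable names are illustrative.

```python
# Sketch: methane polarizability and molar susceptibility, neglecting correlation.
A0_CM = 0.53e-8        # Bohr radius, cm
AVOGADRO = 6.02e23     # molecules per mole

n_valence = 8
xi2 = 3.3e-17          # one-electron <xi^2>, cm^2 (value quoted in the text)

alpha_ch4 = 4.0 * n_valence * xi2 ** 2 / A0_CM   # cm^3 per molecule
molar_susceptibility = AVOGADRO * alpha_ch4      # ccs per mole
excess_over_experiment = molar_susceptibility / 1.6

print(alpha_ch4, molar_susceptibility, excess_over_experiment)
```

The ratio to the experimental 1.6 ccs comes out a little above 2, which is the discrepancy the text attributes to electron correlation.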

This is a clear indication that the electrons are correlated. However,
the correlation enters twice in $\chi$ but only once in $d$ (since we have
expressed $d$ in terms of the experimental $\chi$). Thus, $d$ probably lies between
$2 \times 10^{-9}$ and $3 \times 10^{-9}$ esu.

To see whether $3 \times 10^{-9}$ esu is a reasonable value for $d_{123}$ we compute
the Miller reduced tensor $d/\chi^3 = \Delta_{123} = 24 \times 10^{-6}$ esu.

This is quite exceptionally high. Most materials have allowed
components of $\Delta$ near $3 \times 10^{-6}$ esu and only one coefficient in LiNbO$_3$
($9 \times 10^{-6}$) and $\Delta_{123}$ in hexamethylene tetramine ($15 \times 10^{-6}$) approach
this value.

However, we believe it is in fact not far wrong. In most materials
geometric factors conspire to reduce $d$ by various factors of $\cos\theta$ and
the atomic groups are in the first instance less aspherical than CH$_4$.
In CH$_4$ the effects of every electron are directly additive.

Hexamethylene tetramine (HMT), the other exception to Miller's
rule, is, like methane, a tetrahedral molecule, N$_4$(CH$_2$)$_6$, in a tetrahedral
$I\bar43m$ crystal. The 4 nitrogen atoms form the 111, $\bar1\bar11$, $\bar11\bar1$, $1\bar1\bar1$ corners
of a regular tetrahedron and the CH$_2$ groups occupy the edges but the
N-C-N bonds are bent outwards in such a way that all the angles are
very nearly tetrahedral. The carbon atoms occupy the six sites 2, 0, 0
etc. (see Kitaigorodskii).

The refractivity of the molecule as a whole can be very satisfactorily
accounted for by a system of additive bond refractions. (See Le Fevre
for a review of bond refractions.) The three basic units are 12 C-H
bonds pointing in the $\bar1\bar1\bar1$ and related directions, 4 nonbonding orbitals
on the nitrogen atoms pointing in the 111 and related directions, and
12 N-C bonds in the 111 directions.

Since the refractivities are additive, these units appear to act
independently in determining the molar refractivity. Le Fevre (loc. cit.) gives
values of $R = (4\pi/3)L\alpha$ (where $L$ is Avogadro's number) of 2.8 ccs for
each unbonded nitrogen pair, 1.65 ccs for each C-H bond and 0.62 ccs
for each N-C bond. Thus, the N-C bonds make a rather small
contribution to $\chi$, and probably even less to $d$ since they have an
approximate inversion centre at the centre of the bond (C and N are similar
atoms as compared with C and H). We therefore neglect them.

The 12 CH bonds in the $\bar1\bar1\bar1$ directions are roughly equivalent to 3
methane molecules in the molecular volume 105 ccs, and further the
electrons will be less correlated than in methane. Thus, their contribution
to $d_{123}$ is


$$ d_{123} = -\frac{3 \times 32}{105} \times 3 \times 10^{-9} = -2.7 \times 10^{-9}\ \text{esu}. \tag{39} $$


To calculate the effect of the nonbonding nitrogen electrons we assume
that they occupy $sp^3$ hybrid orbitals directed along 111 etc. with Slater
radial wave functions $Ar\exp(-2.5r/2a_0)$. It is then straightforward
to show that for one electron


$$ t_{123} = -0.055 \times 10^{-24}\ \text{cm}^3. \tag{40} $$

The contribution of the 4 nitrogen atoms to $\chi$ is


$$ \chi = \frac{3}{4\pi}\,\frac{4 \times 2.8}{105} \approx 0.025 $$


and so

$$ d_{123} = -1.7 \times 10^{-9}\ \text{esu}. \tag{41} $$


Thus, the total value of $d_{123}$ is $-4.5 \times 10^{-9}$ esu. This could be slightly
increased by the effects of atomic overlap, and possibly by contributions
from the N-C bonds. It could be either increased or decreased by
electron correlations on individual CH$_2$ groups. The experimental values
for the electro-optic effect (Heilmeyer) and second harmonic generation
(Heilmeyer, Ockman, Braunstein, and Kramer), when corrected for local
field effects using a Lorentz factor, both give $d = +8.2 \times 10^{-9}$ esu.
Thus, our calculation is within a factor 2 of the observed value.
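The HMT bookkeeping can be retraced numerically. The sketch below assumes that the C-H contribution is three methane equivalents scaled by the volume ratio 32/105, that the nitrogen contribution follows from $d = 3\chi t/(a_0 e)$ with $\chi = (3/4\pi)(4 \times 2.8/105)$, and that the nitrogen cubic moment is $-0.055 \times 10^{-24}$ cm$^3$. The exponents are inferred from the quoted results.

```python
# Sketch: additive estimate of d_123 for hexamethylene tetramine (cgs-esu).
import math

A0_CM = 0.53e-8     # Bohr radius, cm
E_ESU = 4.8e-10     # electron charge, esu

# Twelve C-H bonds treated as 3 methane equivalents in 105 ccs (methane: 32 ccs).
d_ch = -(3 * 32 / 105) * 3e-9

# Four nitrogen lone pairs: susceptibility from the bond refraction 2.8 ccs each.
chi_n = (3 / (4 * math.pi)) * (4 * 2.8 / 105)
t_n = -0.055e-24                     # one-electron cubic moment, cm^3
d_n = 3 * chi_n * t_n / (A0_CM * E_ESU)

d_total = d_ch + d_n
ratio_to_experiment = 8.2e-9 / abs(d_total)
print(d_ch, d_n, d_total, ratio_to_experiment)
```

The total comes out near $-4.4 \times 10^{-9}$ esu, a little under a factor of 2 below the experimental magnitude.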

This method is therefore capable, in simple cases, of predicting the
magnitude of $d$ rather successfully. Moreover, the experimental value
of $d$ for HMT suggests that we were correct in assuming that CH$_4$
will have an anomalously large reduced tensor $\Delta_{123}$.

The fact that the division of a complex molecule such as HMT into 
simple components leads to a reasonable value for d leads us to hope 
that a similar procedure will be possible in other cases. It might then 
be possible to assign empirical values of d to basic components such as 
the C-H bond or the N: nonbonding pair, and to combine these ad- 
ditively (with a proper attention to geometry) to predict the values of 
d for even more complex molecules. This would not be surprising since 
a similar procedure (see LeFevre loc. cit.) works very satisfactorily for 
the linear susceptibilities. 

It is then obvious that large nonlinear effects will only result if the
molecule contains polarizable groups disposed in an arrangement which
results in a ground state far from inversion symmetry. The large
value of $\Delta$ in HMT results from the fortunate coincidence that the
most polarizable components are themselves strongly asymmetric and
so oriented that their effects are additive. The much smaller values of
$\Delta$ commonly observed can then be explained as partly due to no group
in the crystal being quite so asymmetric as N: or CH$_2$ in HMT and
partly due to unfavorable geometric relations between the groups. For
example, if our approach is correct we should expect the analogous
compound adamantane, (CH)$_4$(CH$_2$)$_6$, in which the nitrogens are
replaced by CH groups with the CH bond along 111, etc., to have a $d_{123}$
appropriate to 2 (= 3 - 1) CH$_4$ molecules in 105 ccs, i.e., $d_{123} \approx 2 \times 10^{-9}$
esu or about half the value for HMT.

Exceptionally small values of $\Delta$ will occur in materials where most
of the molecule possesses local inversion symmetry, so that only a
fraction of the molecule contributes to $d$, while the whole molecule
contributes to $\chi$. We shall consider an example of this in a later section.

Overlap between adjacent molecules is necessarily bound to lend
further uncertainty to the calculation in materials with a pronounced
band structure, but it seems possible that rough approximations should
be obtainable from, for example, the relation between bandgap and
the corresponding separation in the isolated atoms. In fact, since what
we actually require is $T_{ijk}/n$, which, if the electrons are uncorrelated,
is simply


$$ \frac{T_{ijk}}{n} = \frac{\displaystyle\int_{\text{cell}} x_ix_jx_k\,\rho(\mathbf{r})\, d^3r}{\displaystyle\int_{\text{cell}} \rho(\mathbf{r})\, d^3r}, \tag{42} $$


we may expect that this factor will, to some extent, be self-cancelling.

Finally, we may remark that very much better approximations to
$d_{ijk}$ can obviously be made if we know the ground state wave function
explicitly and also use more sophisticated trial wave functions in the
variational calculation. It is at first sight surprising that a knowledge
of the ground state wave function alone is sufficient to determine $\chi$ and
$d$ which, in the more usual treatments, involve the properties of excited
states. However, we should remember that a knowledge of the exact
ground state wave function is, except in pathological circumstances,
sufficient to determine the unperturbed Hamiltonian and thus the whole
spectrum of states.





III. THE CLASSICAL ANHARMONIC OSCILLATOR MODEL 


Although the considerations of the preceding section are sufficient 
to determine the magnitude of d at low frequencies, they offer little 
guide to the variation of d with frequency and, if recast in terms of time 
dependent perturbation theory they tend to lose their attractive sim- 
plicity. In the next section we shall show that a more familiar form of 
time dependent theory leads to results which can be represented in 
terms of a classical anharmonic oscillator model. Here, we discuss the 
properties of the model itself. 

We assume that unit volume of the material contains $N$ optical
electrons which move in a potential


$$ V = \tfrac{1}{2}m\Omega_i^2x_i^2 + V_{ijk}x_ix_jx_k, \tag{43} $$

where a sum over repeated subscripts is implied. The potential $V_{ijk}$
obviously satisfies $V_{ijk} = V_{ikj}$, etc.

In a field $E_i^\beta e^{i\beta t}$ the equation of motion is


$$ \ddot x_i + \Omega_i^2x_i + \frac{3}{m}V_{ijk}x_jx_k = \frac{e}{m}E_i^\beta e^{i\beta t} \tag{44} $$


and the linear response obtained by neglecting $V_{ijk}$ is


$$ x_i^\beta = \frac{(e/m)\,E_i^\beta e^{i\beta t}}{\Omega_i^2 - \beta^2}. \tag{45} $$

There will be a similar response to a field $E_i^\gamma e^{i\gamma t}$ and, if we introduce
these responses back into the nonlinear term in (44) we obtain a
response at the sum frequency $\alpha = \beta + \gamma$ given by

$$ x_i^\alpha = -\frac{3V_{ijk}e^2}{m^3}\,\frac{E_j^\beta E_k^\gamma e^{i\alpha t}}{(\Omega_i^2 - \alpha^2)(\Omega_j^2 - \beta^2)(\Omega_k^2 - \gamma^2)}. \tag{46} $$


The resulting polarization is $Nex_i^\alpha$ and so the nonlinear coefficient is


$$ d_{ijk}^{\alpha\beta\gamma} = -\frac{3V_{ijk}Ne^3}{m^3}\,\frac{1}{\Omega_i^2 - \alpha^2}\,\frac{1}{\Omega_j^2 - \beta^2}\,\frac{1}{\Omega_k^2 - \gamma^2}. \tag{47} $$

Thus, the symmetry of $d_{ijk}$ mimics that of $V_{ijk}$ if we neglect the
resonance denominators.

The linear susceptibility obtained from (45) is the familiar expression

$$ \chi_{ii}^{\omega} = \frac{Ne^2}{m}\,\frac{1}{\Omega_i^2 - \omega^2} \tag{48} $$

and so if we express $d^{\alpha\beta\gamma}$ as

$$ d_{ijk}^{\alpha\beta\gamma} = \chi_{ii}^{\alpha}\chi_{jj}^{\beta}\chi_{kk}^{\gamma}\,\Delta_{ijk} \tag{49} $$

the reduced Miller tensor is


$$ \Delta_{ijk} = -\frac{3V_{ijk}}{N^2e^3}, \tag{50} $$


which is frequency independent and has the same symmetry as $V_{ijk}$.

If we assume that $V_{ijk}$ is electrostatic in origin its order of magnitude
will be $e^2/d^4$ where $d$ is an atomic spacing and we shall also have $Nd^3 \sim 1$.
Thus,


$$ |\Delta_{ijk}| \approx \frac{3d^2}{e}. \tag{51} $$


With $d$ equal to 2 Å this is $2.5 \times 10^{-6}$ esu, about the mean value of $\Delta$
for most materials. In a later section, we shall give another estimate of $\Delta$.
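The order-of-magnitude estimate can be checked directly. The sketch below assumes the reduced tensor has magnitude $3V/(N^2e^3)$ with $V = e^2/d^4$ and $N = 1/d^3$, which collapses to $3d^2/e$. The function name is illustrative.

```python
# Sketch: electrostatic estimate of the Miller delta (cgs-esu units).
E_ESU = 4.8e-10   # electron charge, esu


def miller_delta_estimate(spacing_cm):
    """|Delta| = 3 V / (N^2 e^3) with V = e^2/d^4 and N = 1/d^3."""
    v_anharmonic = E_ESU ** 2 / spacing_cm ** 4
    n_density = 1.0 / spacing_cm ** 3
    return 3.0 * v_anharmonic / (n_density ** 2 * E_ESU ** 3)


delta_2_angstrom = miller_delta_estimate(2.0e-8)
print(f"{delta_2_angstrom:.2e} esu")
```

For a 2 Å spacing this reproduces the $2.5 \times 10^{-6}$ esu quoted above.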

The potential $V_{ijk}x_ix_jx_k$ distorts the shape of the ground state of
the harmonic oscillator and as a result the system acquires a cubic
moment $t_{ijk}$ [defined in (32)] which we now calculate.

Let $|0\rangle$ represent the unperturbed ground state wave function in
the absence of the anharmonic term and $|p\rangle$ be an excited state, then
the perturbed wave function is


$$ |\,\rangle = |0\rangle - \sum_p \frac{\langle p|V|0\rangle}{\hbar\omega_p}\,|p\rangle. \tag{52} $$

The expectation values of even operators such as $\langle x_i^2\rangle$, $\langle x_ix_j\rangle$ are
unaltered by $V$, while the expectation value of an odd operator such as
$x_i$ or $x_ix_jx_k$ is given by


$$ \langle x_ix_jx_k\rangle = -2\sum_p \frac{\langle 0|x_ix_jx_k|p\rangle\langle p|V|0\rangle}{\hbar\omega_p}. \tag{53} $$

It will suffice if we calculate $t_{ijk}$ with $i \neq j \neq k$. Since $\langle x_ix_j\rangle = 0$ if
$i \neq j$ we only require $\langle x_ix_jx_k\rangle$ and contributions to this come only from
the $6 = 3!$ terms in $V$ with $i \neq j \neq k$. The only state which contributes
to the sum is $|p\rangle = |1, 1, 1\rangle$ with an energy $\hbar(\Omega_1 + \Omega_2 + \Omega_3)$. The matrix
element is

$$ \langle 0|x_1x_2x_3|111\rangle = \left(\frac{\hbar}{2m}\right)^{3/2}(\Omega_1\Omega_2\Omega_3)^{-1/2} $$

and so

$$ t_{123} = \langle x_1x_2x_3\rangle = -12\left(\frac{\hbar}{2m}\right)^3\frac{V_{123}}{\hbar\Omega_1\Omega_2\Omega_3(\Omega_1 + \Omega_2 + \Omega_3)}. $$


It is straightforward to show that a similar result


$$ t_{ijk} = -\frac{3}{2}\left(\frac{\hbar}{m}\right)^3\frac{V_{ijk}}{\hbar\Omega_i\Omega_j\Omega_k(\Omega_i + \Omega_j + \Omega_k)} \tag{54} $$


holds for all the components of $t_{ijk}$.
If we substitute this relation in (47) and take the limit as $\alpha, \beta, \gamma \to 0$
we obtain


$$ d_{ijk} = 2\,\frac{3\chi}{a_0e}\,t_{ijk}. \tag{55} $$


This is twice the value obtained in (33) because there we treated $t_{ijk}$
as a fixed property of the ground state which was then perturbed by $E$;
whereas here we have considered an even ground state perturbed by $E$
and a fixed potential.
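The factor of 2 can be confirmed numerically in the isotropic static limit. The sketch below assumes the oscillator relations in the forms $t = -\tfrac{3}{2}(\hbar/m)^3 V/[\hbar\,\Omega^3 \cdot 3\Omega]$ and $d = -3VNe^3/(m^3\Omega^6)$, and compares the result with $3\chi t/(a_0e)$, where $\chi = Ne^2/(m\Omega^2)$ and $a_0 = \hbar^2/me^2$. The numerical inputs are arbitrary test values.

```python
# Sketch: oscillator-model d versus the variational form, isotropic static limit.
HBAR = 1.0546e-27   # erg s
M_E = 9.109e-28     # g
E_ESU = 4.803e-10   # esu


def d_oscillator_static(n_density, omega, t_model):
    """Invert the cubic-moment relation for V, then use the static d formula."""
    v_ijk = -(2.0 / 3.0) * (M_E / HBAR) ** 3 * HBAR * 3.0 * omega ** 4 * t_model
    return -3.0 * v_ijk * n_density * E_ESU ** 3 / (M_E ** 3 * omega ** 6)


def d_variational(n_density, omega, t_ground):
    """d = 3 chi t / (a0 e) with the oscillator susceptibility."""
    chi = n_density * E_ESU ** 2 / (M_E * omega ** 2)
    a0 = HBAR ** 2 / (M_E * E_ESU ** 2)
    return 3.0 * chi * t_ground / (a0 * E_ESU)


# Arbitrary illustrative inputs: density (cm^-3), frequency (rad/s), moment (cm^3).
ratio = d_oscillator_static(1e22, 2e16, 1e-25) / d_variational(1e22, 2e16, 1e-25)
print(ratio)
```

The ratio is independent of the inputs chosen, since density, frequency and cubic moment cancel between the two expressions.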

Thus, if the real system has a cubic moment $t_{ijk}$ in the ground state,
the equivalent anharmonic oscillator model requires an anharmonic
potential

$$ V_{ijk} = -\frac{1}{3}\left(\frac{m}{\hbar}\right)^3\hbar\,\Omega_i\Omega_j\Omega_k(\Omega_i + \Omega_j + \Omega_k)\,t_{ijk} \tag{56} $$

and this will result in a cubic moment $t'_{ijk} = \tfrac{1}{2}t_{ijk}$ in the oscillator ground
state.

In the real crystal $t_{ijk}$ may be an accessible quantity. It obviously is
in molecular crystals of strongly covalent compounds such as CH$_4$.
But, in ionic crystals it may be more sensible to consider the ions as
spheres perturbed by a crystal potential $V^c_{ijk}$. In a later section we
shall see that there is a simple relation between the model potential
and $V^c_{ijk}$.

The classical anharmonic oscillator model has previously been used
by Bloembergen, Garrett and Robinson, and Kurtz to give a qualitative
account of nonlinear phenomena. The latter authors also discuss
in some detail its relation to Miller's rule.

Obviously, the model is the nonlinear analogue of the classical har- 
monic oscillator model used with such success for the last 60 years in 
the discussion of linear behavior such as dispersion, and, just as the 
harmonic model is directly related to the results of a quantum mechan- 
ical treatment, we may expect the anharmonic oscillator to have a 
similar basis. In the next section we explore this relation. 


IV. TIME DEPENDENT QUANTAL TREATMENT 


A number of authors (Bloembergen; Armstrong, Bloembergen,
Ducuing and Pershan; Butcher and McLean; Kelley; Cheng and
Miller; and Ward) have given rigorous quantal treatments of optical
nonlinearities in solids. We select an expression due to Armstrong,
et al (loc. cit.) which expresses the nonlinear coefficients in terms of the
energies $\hbar\omega_p$ of excited states and the matrix elements $\langle 0|x_i|p\rangle$,
$\langle p|x_i|q\rangle$, etc. of the dipole operator between states. The ground state
is $|0\rangle$.

This expression is valid, either for an assembly of N isolated atoms
in unit volume or, in the dipole approximation, for a real solid where 
the wave functions overlap. In the latter case, the solid must be divided 
into cells of the periodic lattice, and N is then the density of cells, while 
the matrix elements are to be evaluated only over the interior of a cell. 
The periodicity of the lattice ensures that contributions from parts of 
the wave function outside a cell cancel in the crystal as a whole. 

To avoid a plethora of subscripts we let each of $x$, $y$, and $z$ serve to
represent any one of the components and we can then write the
expression for $d$ as


$$ d_{xyz}^{\alpha\beta\gamma} = \frac{Ne^3}{\hbar^2}\sum_p\sum_q\left\{ x_{0p}y_{pq}z_{q0}\,\frac{\omega_p\omega_q + \alpha\gamma}{(\omega_p^2 - \alpha^2)(\omega_q^2 - \gamma^2)} + x_{0p}z_{pq}y_{q0}\,\frac{\omega_p\omega_q + \beta\alpha}{(\omega_p^2 - \alpha^2)(\omega_q^2 - \beta^2)} + y_{0p}x_{pq}z_{q0}\,\frac{\omega_p\omega_q - \beta\gamma}{(\omega_p^2 - \beta^2)(\omega_q^2 - \gamma^2)}\right\}. \tag{57} $$

This expression vanishes if the states $|0\rangle$, $|p\rangle$, etc. have a definite parity;
its value therefore depends on the existence of matrix elements whose
presence is contingent on the absence of inversion symmetry. For this
reason, it is almost impossible to make an informed guess about its
magnitude or behavior.

An analogous expression for the linear susceptibility is


$$ \chi_{xy} = \frac{2Ne^2}{\hbar}\sum_p \frac{\omega_p\,x_{0p}y_{p0}}{\omega_p^2 - \omega^2} \tag{58} $$

and in both expressions an operator $x$ is to be understood as the total
operator for the contents of a cell, i.e., the sum of the individual electron
operators. Of course, we can neglect the core (nonvalence) electrons on
the grounds that they are too tightly bound to contribute to the optical
properties.

A familiar approximation to $\chi$ is obtained if we note that in (58) the
variation of the summand with $|p\rangle$ is almost exclusively due to the
matrix elements. These not only obey selection rules, but also decrease
rapidly in magnitude as the state $|p\rangle$ increases in energy, and therefore
overlaps the ground state less and less. For example, in the H atom with
a 1S ground state the matrix element $x_{0p}$ vanishes unless $p$ is one of
the states 2P, 3P, etc. Moreover, as we go from the 2P state to the 8P
state $x_{0p}x_{p0}$ decreases by over a hundredfold. At the same time, $\omega_p$
changes by less than 30 percent. Thus, except near a resonance, we can
treat $\omega_p$ as a constant $\Omega$, somewhere near the first allowed transition and
write (58) as


$$ \chi_{xy} = \frac{2Ne^2\Omega}{\hbar(\Omega^2 - \omega^2)}\,{\sum_p}'\,x_{0p}y_{p0}, \tag{59} $$


where the primed sum excludes $p = 0$. Now

$$ {\sum_p}'\,x_{0p}y_{p0} = \sum_p x_{0p}y_{p0} - x_{00}y_{00} = (xy)_{00} - x_{00}y_{00} = \langle(x - \langle x\rangle)(y - \langle y\rangle)\rangle, \tag{60} $$

where $\langle\ \rangle$ denotes a ground state expectation value. Thus,


$$ \chi_{xy} = \frac{2Ne^2\Omega}{\hbar(\Omega^2 - \omega^2)}\,\langle(x - \langle x\rangle)(y - \langle y\rangle)\rangle. \tag{61} $$


We shall not pursue the further manipulations of (61) using the sum
rule which lead back to (27) but we remark that in many cases a form
such as (61) for $\chi$, involving a single Sellmeier or classical oscillator
term, gives an excellent account of optical dispersion, and that when
applied to the hydrogen atom, with $\hbar\Omega$ set equal to $3e^2/8a_0$, the 1S - 2P
energy, it leads to a value of $\alpha$ at low frequencies


$$ \alpha = \tfrac{16}{3}\,a_0^3, $$


which exceeds the correct value $4.5a_0^3$ by 32/27 or 18 percent.
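The hydrogen figures follow from a few lines of exact arithmetic. In atomic units the single-oscillator polarizability per atom is $2\langle x^2\rangle/\Omega$, with $\langle x^2\rangle = 1$ and $\Omega = 3/8$. The sketch below uses exact fractions, and the variable names are illustrative.

```python
# Sketch: single-oscillator estimate of the hydrogen polarizability (atomic units).
from fractions import Fraction

omega = Fraction(3, 8)        # 1S - 2P energy
x2 = Fraction(1)              # <x^2> for the 1S state
alpha_exact = Fraction(9, 2)  # known exact static polarizability

alpha_single = 2 * x2 / omega
overestimate = alpha_single / alpha_exact
print(alpha_single, overestimate, float(overestimate) - 1.0)
```

The excess over the exact value is 5/27, a little over 18 percent.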

Before we can adopt a similar procedure with the nonlinear coefficient
we must first satisfy ourselves that there is no essential difference
between a sum with three matrix elements and one with two. In $\chi$ all
matrix elements terminate on $|0\rangle$ but in $d$ it is quite possible in a term
such as $x_{0p}y_{pq}z_{q0}$, with $p$, $q$ corresponding to highly excited states of
great spatial extent, that the term $y_{pq}$ may be large enough to compensate
for the smallness of $x_{0p}z_{q0}$. If this were the case it would be possible for
the exact value of the sum to depend critically on cancellations between
large terms involving highly excited states. The removal of the
frequencies $\omega_p$, etc. as a single average would then have disastrous results on
the sum.

We will advance three arguments why this is unlikely. Consider first
an even higher-order calculation, that of the fourth-order Stark shift
of the ground state of atomic hydrogen due to a field $F$. In atomic
units this is given by an exact calculation (Dalgarno) as


$$ W^{(4)} = -\tfrac{3555}{64}F^4 \approx -56F^4. \tag{62} $$

We can also express $W^{(4)}$ (Dalgarno, loc. cit.) as

$$ W^{(4)} = -F^4\left\{{\sum_p}'{\sum_q}'{\sum_r}'\,\frac{x_{0p}x_{pq}x_{qr}x_{r0}}{\omega_p\omega_q\omega_r} - {\sum_p}'\,\frac{x_{0p}x_{p0}}{\omega_p}\,{\sum_q}'\,\frac{x_{0q}x_{q0}}{\omega_q^2}\right\}. $$


Our procedure treats $\omega_p$, $\omega_q$ and $\omega_r$ as a single constant $\Omega$ and leads to


$$ W^{(4)} = -\frac{F^4}{\Omega^3}\left\{\langle(x - \langle x\rangle)^4\rangle - 2\langle(x - \langle x\rangle)^2\rangle^2\right\}. \tag{63} $$


For the H atom $\langle x\rangle = 0$, $\langle x^2\rangle = 1$ a.u. and $\langle x^4\rangle = 9/2$ a.u. so that, if
we set $\Omega = 3/8$ a.u., the 1S - 2P energy difference,


$$ W^{(4)} = -\tfrac{1280}{27}F^4 \approx -48F^4. \tag{64} $$


This is close to the correct result (62), but despite the fact that we have
taken the lowest possible value of $\Omega$ it is too small. This is a clear
indication that some cancellation of higher terms, which we have aggravated
by our cavalier treatment of $\omega_p$, etc., is occurring. This is not surprising
for, if in the triple sum we consider the lowest possible sequence of levels
1S 2P 2S 2P 1S for which $\omega_p = \omega_q = \omega_r = \Omega$ the product of the matrix
elements is 5 a.u. while for the sequence 1S 8P 8S 8P 1S the product
is 10 a.u. made up of a contribution of 0.0033 from the two 1S 8P
elements and 3000 from the 8S 8P elements.

However, this is not quite so serious as it appears, for in a real solid 
no matrix element can exceed the linear dimensions of a cell say 5 a.u. 
so that the product in the low transition would remain at 5 a.u. while 
the product for the upper transition would be reduced to 0.08. 

Our final argument is empirical. If cancellations between large terms
are critically important, the relevant feature of our procedure is the
change in the ratio $\omega_p/\omega_q$ it causes for highly excited neighboring states.
In hydrogen the ratio of the 1S - 8P energy to the 1S - 7P energy
is 1.005 and we replace this by unity. In a time dependent theory
resonance denominators appear, and if the sum is really so critically
balanced, we expect the observed quantity, in this case the
hyperpolarizability, to vary rapidly with frequency when $\omega^2 \approx 0.005\,\Omega^2$,
i.e., at a frequency 10 times lower than the first absorption edge. In
nonlinear optics, no such variation is observed until one of the
frequencies approaches much more closely (about 70 percent) to the absorption
edge (Chang, Ducuing, and Bloembergen).
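The quoted energy ratio for neighboring highly excited states follows from the Bohr levels. The sketch below assumes transition energies proportional to $1 - 1/n^2$ (in units of the ionization energy) and uses exact fractions. The function name is illustrative.

```python
# Sketch: ratio of the 1S-8P to the 1S-7P transition energy in hydrogen.
from fractions import Fraction


def transition_energy(n):
    """Energy of the 1S -> nP transition in units of the ionization energy."""
    return 1 - Fraction(1, n * n)


ratio_8p_7p = transition_energy(8) / transition_energy(7)
print(ratio_8p_7p, float(ratio_8p_7p))
```

The ratio differs from unity by about half a percent, which is the size of the error introduced by replacing both energies with one average.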

Taken together these arguments give us reasonable grounds for
hoping that the sums will not bite us if we remove $\omega_p$, etc. from under
the summation sign.

In the sum in (57) there is no restriction on $p$ or $q$; in particular, terms
with either $p = 0$ or $q = 0$ occur. These will lead to trouble if we attempt
to approximate the sums as they stand. We therefore first segregate
all such terms. Let $\{\ \}$ denote the entire summand in (57), then


Da ee es a ee 


W, —~ VY Yor Wr — Q@ 
a Zortro _ B oe ATi ra 
Yoo B pie: ws __ eine vidos 5 2 — Bp 
8 Ure ye cseoeeee : 
Zoo 7 00 i eee 65 
a yea pt ued Ei fet (65) 


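Three single sums remain unprimed, but because $\alpha = \beta + \gamma$ the terms
with $r = 0$ cancel and so we may regard all the sums as primed.

We can now remove $\omega_p$, $\omega_q$ and $\omega_r$ as a single average $\Omega$, and this leads
to an expression containing terms such as


$$ {\sum_p}'{\sum_q}'\,x_{0p}y_{pq}z_{q0} = \sum_p\sum_q x_{0p}y_{pq}z_{q0} - x_{00}\sum_r y_{0r}z_{r0} - z_{00}\sum_r x_{0r}y_{r0} + x_{00}y_{00}z_{00}. $$

Each of the sums on the right is now a ground state expectation value.
When all the terms are collected together we obtain


$$ d_{xyz}^{\alpha\beta\gamma} = \frac{Ne^3}{\hbar^2}\,\frac{\Omega^2(3\Omega^2 + \beta\gamma - \alpha^2)}{(\Omega^2 - \alpha^2)(\Omega^2 - \beta^2)(\Omega^2 - \gamma^2)}\,\{\langle xyz\rangle - \langle x\rangle\langle yz\rangle - \langle y\rangle\langle zx\rangle - \langle z\rangle\langle xy\rangle + 2\langle x\rangle\langle y\rangle\langle z\rangle\} \tag{66} $$


which we can also write as

$$ d_{xyz}^{\alpha\beta\gamma} = \frac{Ne^3}{\hbar^2}\,T_{xyz}\,D(\Omega, \alpha, \beta, \gamma) \tag{67} $$


in terms of the, by now, familiar cubic moment $T_{xyz}$. This expression
bears an obvious resemblance to (61) for $\chi$.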

Our expression (66) or (67) would be very nearly exact if all the optical
levels had very nearly the same energy. It would then correspond to
the fictitious two level system (see Refs. 15, 16, 18) often used to
obliterate some of the intractable features of (57). Unlike this model,
however, our expression retains the geometry of the system implicit in the
selection and sum rules.

Equation (66) is possibly valid up to a frequency where one of $\alpha$, $\beta$,
or $\gamma$ approaches the first allowed transition frequency. At somewhat
lower frequencies, it is legitimate to drop the term $\beta\gamma - \alpha^2$ in the
numerator. This then allows us to make a further generalization at no increase
in complexity.

By removing $\omega_p$ and $\omega_q$ from (57) as a single average we have tacitly
neglected the possibility that the system might be birefringent. We
can remedy this by noting that in (57) each frequency $\omega_p$ or $\omega_q$
is uniquely associated with a matrix element such as $x_{0p}$ or $z_{q0}$ which
terminates on the ground state $|0\rangle$ and therefore also appears in $\chi$.
Thus, we can consistently introduce three averages $\Omega_x$, $\Omega_y$, and $\Omega_z$
associated with correspondingly polarized transitions. If we follow this
process through all its tedious ramifications, we find that, except in
the term $\beta\gamma - \alpha^2$ which we are omitting, it leads to the surprisingly
simple result that $D(\Omega, \alpha, \beta, \gamma)$ is replaced by


$$ D(\Omega, \alpha, \beta, \gamma) = \frac{\Omega_x\Omega_y\Omega_z(\Omega_x + \Omega_y + \Omega_z)}{(\Omega_x^2 - \alpha^2)(\Omega_y^2 - \beta^2)(\Omega_z^2 - \gamma^2)}. \tag{68} $$

Thus,


$$ d_{xyz}^{\alpha\beta\gamma} = \frac{Ne^3}{\hbar^2}\,\frac{\Omega_x\Omega_y\Omega_z(\Omega_x + \Omega_y + \Omega_z)}{(\Omega_x^2 - \alpha^2)(\Omega_y^2 - \beta^2)(\Omega_z^2 - \gamma^2)}\,T_{xyz}. \tag{69} $$

If we compare this with the result for the classical anharmonic oscillator
obtained by combining (54) with (47) we see that they are identical
except for a factor 2 which once again arises because in one case we
assumed that $T_{xyz}$ was a fixed parameter while in the other it was $V_{xyz}$.

We now see that the classical model is equivalent to the quantal
treatment, except near a resonance, in the following sense.

If we construct the model by choosing $\Omega_x$, $\Omega_y$, and $\Omega_z$ to give the
correct linear properties then we must choose the anharmonic term in
the potential to produce a cubic moment in the ground state of the model
equal to $\tfrac{1}{2}$ the corresponding moment in the real system. The dynamical
properties of the two systems are then equivalent and the model can
be used to treat more complicated systems where the quantal treatment
is too difficult.
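The two dispersion factors discussed above can be compared numerically. The sketch below assumes the single-frequency factor has the form $\Omega^2(3\Omega^2 + \beta\gamma - \alpha^2)/[(\Omega^2-\alpha^2)(\Omega^2-\beta^2)(\Omega^2-\gamma^2)]$ and the birefringent factor the form given in (68). Both function names are illustrative, and frequencies are in arbitrary common units.

```python
# Sketch: dispersion factors for the nonlinear coefficient.
def d_factor_isotropic(omega, a, b, g):
    """Single average frequency, keeping the (b*g - a^2) term."""
    num = omega ** 2 * (3.0 * omega ** 2 + b * g - a ** 2)
    den = (omega ** 2 - a ** 2) * (omega ** 2 - b ** 2) * (omega ** 2 - g ** 2)
    return num / den


def d_factor_birefringent(ox, oy, oz, a, b, g):
    """Three polarization-dependent averages, with the (b*g - a^2) term dropped."""
    num = ox * oy * oz * (ox + oy + oz)
    den = (ox ** 2 - a ** 2) * (oy ** 2 - b ** 2) * (oz ** 2 - g ** 2)
    return num / den


# Static limit: both forms reduce to 3 / omega^2 when the three averages coincide.
static_iso = d_factor_isotropic(2.0, 0.0, 0.0, 0.0)
static_bi = d_factor_birefringent(2.0, 2.0, 2.0, 0.0, 0.0, 0.0)

# Second harmonic generation well below resonance: a = 2w, b = g = w.
low_iso = d_factor_isotropic(2.0, 0.2, 0.1, 0.1)
low_bi = d_factor_birefringent(2.0, 2.0, 2.0, 0.2, 0.1, 0.1)
print(static_iso, static_bi, low_iso, low_bi)
```

At low frequency the two factors differ by under one percent in this example, consistent with the claim that the dropped term matters only as a resonance is approached.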

We now consider the relation of $V_{xyz}$ to the actual potential
responsible for the existence of $T_{xyz}$. Obviously, the relation is obtained by
requiring that both potentials yield the same cubic moment, one in the
model, the other in the real system. In this case, there will be no factor
of 2.

For simplicity, we consider only a system which is isotropic before 
the application of the anharmonic potential. Further, we restrict our- 
selves to atoms in which there is only one valence electron. Our results 
will, however, be directly applicable to atoms with more electrons if 
we can neglect electron correlations. 

We already have an expression for the oscillator (54) which we can
write as

$$ T_{ijk} = -4a^6\,\frac{V_{ijk}}{\hbar\omega_0}, \tag{70} $$


where $\omega_0$, the classical frequency, also corresponds to the first allowed
transition, and

$$ a = \left(\frac{\hbar}{2m\omega_0}\right)^{1/2} \tag{71} $$


is a measure of the extent of the system in one dimension. The direct
proportionality between $T_{ijk}$ and the corresponding component of
$V_{ijk}$ occurs because the oscillator Schrödinger equation separates in
Cartesian coordinates. In general, as we shall show, it will only hold if
$V = V_{ijk}x_ix_jx_k$, the crystal potential, satisfies Laplace's equation.
We will consider a more general potential of the form


$$ V = \sum_{nlm} V_{nl}^m\,r^n P_l^m(\theta, \phi), \tag{72} $$

where $P_l^m$ is an associated Legendre polynomial normalized to unity.
This potential satisfies Laplace's equation only if $n = l$.

If the unperturbed ground state wave function is $\psi_0$ the first-order
correction $\psi_1$ due to $V$ satisfies


$$ (H_0 - E_0)\psi_1 + (V - E_1)\psi_0 = 0. $$


Since $T$ is an odd moment we need only consider odd terms in $V$ (in
fact only $l = 1$ and $l = 3$) and for these $E_1$ vanishes since $\psi_0$ has definite
parity.

We let

$$ \psi_1 = f\psi_0 \tag{73} $$

and then

$$ (H_0 - E_0)f\psi_0 = -V\psi_0 $$

but, since

$$ H_0\psi_0 = E_0\psi_0, $$

this leads to

$$ \nabla^2f + 2\nabla f\cdot\nabla\log\psi_0 = \frac{2m}{\hbar^2}\,V. \tag{74} $$


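Now $\psi_0$ is a function of $r$ alone and so we can write


$$ f = \sum_{nlm} V_{nl}^m\,a_{nl}(r)\,P_l^m(\theta, \phi), \tag{75} $$


where $a_{nl}(r)$, which does not depend on $m$, satisfies


$$ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial a_{nl}}{\partial r}\right) - \frac{l(l+1)}{r^2}\,a_{nl} + 2\,\frac{\partial a_{nl}}{\partial r}\,\frac{\partial\log\psi_0}{\partial r} = \frac{2m}{\hbar^2}\,r^n. \tag{76} $$


The perturbed ground state is therefore

$$ \psi = \Big\{1 + \sum_{nlm} V_{nl}^m\,a_{nl}(r)\,P_l^m(\theta, \phi)\Big\}\,\psi_0(r) \tag{77} $$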


and in this new ground state we can easily evaluate expectation values 
such as 


Px") — Ss Virbonr ’ (78) 


where 


a I Pan) vyr dr. (79)

In evaluating $T_{ijk}$ we shall need $\langle x_i\rangle$, $\langle x_ix_j\rangle$, and $\langle x_ix_jx_k\rangle$. The even
moments are unchanged by $V$ and we obtain the odd moments by
expanding $x_i$ and $x_ix_jx_k$ in terms of Legendre polynomials and powers
of $r$.

We omit most of the gruesome details of the calculation, and further
restrict $V$ to contain only terms of the type


$$ V = V_{ijk}x_ix_jx_k + X_ix_i = \sum_m \{V_{33}^m\,r^3P_3^m + V_{31}^m\,r^3P_1^m + V_{11}^m\,rP_1^m\}. \tag{80} $$


The term in $V_{31}^m$, which does not satisfy Laplace's equation, is necessary
to obtain the most general form of the cubic part of the potential
$V_{ijk}x_ix_jx_k$. This contains 10 independent parameters while $P_3^m$ has only
7. The missing 3 are supplied by $P_1^m$.
If this term is absent, we have


$$ \nabla^2V = 0 \tag{81} $$

and then

$$ S_i = V_{ixx} + V_{iyy} + V_{izz} = 0. \tag{82} $$
With all the terms present we obtain 
(t;) = $BisiS: + FBX; (83) 
(vi) = s5Bsa3V ics + 3{ 258331 — 1758aa3}S; + 26arX; (84) 
(2:23) = s5BssxViz; + {'sBss1 — T75Bs08}S; + wsBsrX; (85) 


(x 5:2 ;2) — 35BesnV eis ) (86) 
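where it is to be understood that $i \neq j \neq k$.
If we let

$$ \gamma = \langle x_i^2\rangle, \tag{87} $$


then since $\langle x_ix_j\rangle = 0$, $i \neq j$, and $\langle x_i\rangle\langle x_j\rangle\langle x_k\rangle$ is third order in $V$, we
obtain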


T i3; = F5Bss3V ii; + 3958321 — T75Bsa3 — 27Biaf S; 
+ 3{¥sBa1 — 37¥BinfX; (88) 
Tig = 35Bs22V ii; + {2°s83:1 — 758333 — FyBia} S; 
+ {58s — 3yBin}X; — (89) 
T isx = B5B333V ize - (90) 
For a harmonic oscillator

6 - ae 
a _ a a 
Bass = —35 hw,’ Bas: = —80 ho,’ Bsi1 = —109 is 
4 4 ; 
— - o af 91 
Biss = —* hw, ’ Bisi = —1od hus, ' Bin = —3 his. ( ( ) 
y=a 


and it is easy to check that the coefficients of $S_i$ and $X_i$ vanish, so that we recover (70).

If $V$ satisfies Laplace's equation, $S_i = 0$, and in the absence of an internal field $X_i$ every component is given by


T 3% oa 35B333 V ain : (92) 


Thus, in this case $T_{ijk}$ and the reduced Miller tensor $A_{ijk}$ have the same symmetry as $V_{ijk}$. Therefore, since $S_i = 0$, we have

$$A_{iii} + A_{ijj} + A_{ikk} = 0. \tag{93}$$

If 3 is an axis of 3-fold or higher symmetry, $A_{311} = A_{322}$ and so, for example,

$$A_{333} = -2A_{311}. \tag{94}$$
This relation is rather well obeyed by the coefficients for the 6-mm crystals listed in Table I. Signs are available only for the electro-optic coefficients and so the s.h.g. results represent moduli only. References to the experimental data are given in conjunction with later tables. In the case of the electro-optic data, the experimental figure is for $A_{113}$ and we have assumed that Kleinman's rule (Kleinman) holds and that this is equal to $A_{311}$. Except for s.h.g. in ZnO the ratio is −2 within the experimental error.

TABLE I

Material   Wavelength (μ)   A_333 × 10^6 esu   A_311 × 10^6 esu   Ratio

Linear Electro-optic
ZnO        optical          1.5                −0.8               −2.1
ZnS        optical          0.9                −0.45              −2.0
CdS        optical          1.2                −0.55              −2.2

Second Harmonic
ZnO        1.06             3.3                1.1                +3.0
ZnS        10.6             4.9                2.45               +2.0
CdS        1.06             3.2                1.6                +2.0
CdS        10.6             5.4                3.3                +1.6
CdSe       10.6             4.8                2.4                +2.0
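
The ratio column of Table I can be recomputed from the two tabulated coefficients. The Python sketch below does this under stated assumptions: the numbers are transcribed from Table I, the s.h.g. entry for $A_{333}$ in ZnO and the 10.6 μ entry for $A_{311}$ in CdS are both read as 3.3 to agree with Table II, and the names `ratio` and `deviation_from_rule` are illustrative only.

```python
# Coefficients A_333 and A_311 in units of 1e-6 esu, transcribed from Table I.
# Two partly illegible entries (ZnO s.h.g. A_333, CdS 10.6 micron A_311) are
# read as 3.3; this is an assumption chosen to agree with Table II.
TABLE_I = [
    ("ZnO", "electro-optic", 1.5, -0.8),
    ("ZnS", "electro-optic", 0.9, -0.45),
    ("CdS", "electro-optic", 1.2, -0.55),
    ("ZnO", "s.h.g. 1.06", 3.3, 1.1),
    ("ZnS", "s.h.g. 10.6", 4.9, 2.45),
    ("CdS", "s.h.g. 1.06", 3.2, 1.6),
    ("CdS", "s.h.g. 10.6", 5.4, 3.3),
    ("CdSe", "s.h.g. 10.6", 4.8, 2.4),
]


def ratio(a333, a311):
    """Ratio A_333 / A_311; equation (94) predicts -2."""
    return a333 / a311


def deviation_from_rule(a333, a311):
    """Fractional departure of |A_333 / A_311| from the value 2."""
    return abs(abs(ratio(a333, a311)) - 2.0) / 2.0


if __name__ == "__main__":
    for material, effect, a333, a311 in TABLE_I:
        print(f"{material:5s} {effect:14s} ratio = {ratio(a333, a311):+.2f} "
              f"deviation = {100 * deviation_from_rule(a333, a311):.0f}%")
```

The s.h.g. rows give moduli only, so the comparison with (94) is made on the magnitude of the ratio.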

If, on the other hand, the sole perturbation in V is the field X; , we 
have 


T5355 = 3D 553 _ 24 Bai =~ 578111} (95) 


and the expected ratio is +3. In crystals where both terms occur in $V$ with arbitrary strength, any value of the ratio is possible. This is observed in the ferro-electric crystals BaTiO3, where the ratio is +4, and LiNbO3, where it is +1.7 for the electro-optic effect and +11 for s.h.g. It is perhaps somewhat surprising that the ratio is so exactly −2 in the 6-mm crystals since this is a polar point group and an internal field $X_3$ is not forbidden by symmetry.

If $V$ does not satisfy Laplace's equation (it need only satisfy Poisson's equation), there is no direct relation between the components of $T_{ijk}$ and those of $V_{ijk}$ even in the absence of a field $X_i$, although, since $xyz$ is a spherical harmonic, we still have


T 123 = 358333 Vins : (96) 
However, since the coefficients of $S_i$ vanish for the harmonic oscillator we may expect them to be small in other cases. We gain some support for this view by considering the hydrogen atom for which





Sea — (209)" #2 _ _ 35385 a, _ 1485 ap 
333° 77 g ho, ’ 331 77 256 hw, ’ 31h 7 64 ho, 
Bg = 803 a og = 8 ag BL, (97) 
133 128 ho, ’ is 16 ha, ’ 111 7 39 hoo, 
y= a, 


where as usual $a_0 = \hbar^2/me^2$ and $\hbar\omega_0 = \tfrac{3}{8}(e^2/a_0)$ is the first allowed transition ($1S \to 2P$) energy.

The coefficient of $V_{ijk}$ in each term of $T_{ijk}$ is then $-\tfrac{315}{16}(a_0^6/\hbar\omega_0)$ while the coefficient of $S_i$ in $T_{iii}$ is a factor 23/200 smaller. In $T_{ijj}$ it is 23/600 smaller. Thus, in the absence of $X_i$ the non-Laplacean terms in $V$ cause no more than an 11 percent departure from the relation

$$T_{ijk} = -\frac{315}{16}\,\frac{a_0^6}{\hbar\omega_0}\,V_{ijk}. \tag{98}$$
Since we expect the dominant terms in $V$ to satisfy Laplace's equation it appears that $T_{ijk}$, $A_{ijk}$ and the model potential $V'_{ijk}$ will be very nearly proportional to the corresponding terms in $V$.

The potential $V'$ required in the model is related to the crystal potential by

Bsa Vii = B333 Visin . (99) 

For a hydrogen atom this gives $V'_{ijk} \approx 5V_{ijk}$; thus, insofar as real atoms behave like hydrogen atoms, a model with the same spatial extent $a = a_0$ and the same first allowed transition frequency $\omega_0$ will require a potential roughly five times as strong as the actual potential. This reflects the obvious fact that a harmonic oscillator is a stiffer system with more sharply localized ($\psi \sim e^{-r^2}$) wave functions than an atom ($\psi \sim e^{-r}$).

We have now shown that, with an appropriate choice of parameters 
a classical anharmonic oscillator model is a very good approximation 
to the intrinsic electronic nonlinearities of real systems. 

In the next section, we use the model to consider the effect of lattice 
polarizability which we have so far neglected. 


V. LOCAL FIELDS AND LATTICE POLARIZATION 


We have already remarked in the introduction that the seat of the 
nonlinearities resides in the electronic motion. It is, however, consider- 
ably modified by local field corrections and in the case of optical rectifica- 
tion and the linear electro-optic effect by lattice polarization. 

Miller's rule states that $d_{ijk}^{\alpha\beta\gamma}$ is proportional to the product of the observed linear susceptibilities $\chi_{ii}^{\alpha}$, etc. at the appropriate frequencies. If one of these is dc we are to take the actual dc susceptibility and not the extrapolated long wavelength limit of the optical susceptibility.

At first sight, it seems plausible that this is simply the effect of internal fields, which cause the local field experienced by an atom to be greater than the applied field. We now examine this hypothesis and show that it is inadequate.

Microscopic calculations yield the polarization of single atoms due to local fields. In the linear case, if we have $N$ atoms per unit volume of polarizability $\alpha$,

$$P = N\alpha E_{\text{local}} \tag{100}$$

and the local field is related to the applied field $E$ by

$$E_{\text{local}} = E + \Gamma P. \tag{101}$$
In some cases the Lorentz value of $\Gamma = 4\pi/3$ is applicable and we then obtain the well-known relation between the refractive index $n$, or the dielectric constant $\epsilon = n^2$, and $\alpha$,

$$\frac{n^2 - 1}{n^2 + 2} = \frac{4\pi}{3}N\alpha = \frac{R}{V}, \tag{102}$$

where $V$ is the molar volume and $R$ is the molar refractivity.
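In general,

$$P = N\alpha(E + \Gamma P) = \frac{N\alpha}{1 - \Gamma N\alpha}\,E \tag{103}$$

and the observed susceptibility is

$$\chi = \frac{P}{E} = \frac{N\alpha}{1 - \Gamma N\alpha} \tag{104}$$

while

$$E_{\text{local}} = (1 + \Gamma\chi)E = \frac{E}{1 - \Gamma N\alpha}. \tag{105}$$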
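
The local-field relations (104) and (105) are easy to check numerically. The Python sketch below is only an illustration: the value chosen for $N\alpha$ is arbitrary, the function names are hypothetical, and the Gaussian-unit relation $n^2 = 1 + 4\pi\chi$ is assumed in order to recover (102) for the Lorentz value of $\Gamma$.

```python
import math


def observed_susceptibility(n_alpha, gamma):
    """Equation (104): chi = N*alpha / (1 - Gamma*N*alpha)."""
    return n_alpha / (1.0 - gamma * n_alpha)


def local_field_factor(n_alpha, gamma):
    """Equation (105): E_local / E = 1 / (1 - Gamma*N*alpha)."""
    return 1.0 / (1.0 - gamma * n_alpha)


def lorentz_lorenz_lhs(chi):
    """(n^2 - 1) / (n^2 + 2), assuming n^2 = 1 + 4*pi*chi (Gaussian units)."""
    eps = 1.0 + 4.0 * math.pi * chi
    return (eps - 1.0) / (eps + 2.0)


if __name__ == "__main__":
    gamma = 4.0 * math.pi / 3.0   # Lorentz value
    n_alpha = 0.05                # arbitrary illustrative value of N*alpha
    chi = observed_susceptibility(n_alpha, gamma)
    print("chi                 =", chi)
    print("1 + Gamma*chi       =", 1.0 + gamma * chi)
    print("1/(1 - Gamma*N*a)   =", local_field_factor(n_alpha, gamma))
    print("(n^2-1)/(n^2+2)     =", lorentz_lorenz_lhs(chi))
    print("(4*pi/3)*N*alpha    =", gamma * n_alpha)
```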


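In nonlinear optics the two driving fields $E_j^{\beta}$ and $E_k^{\gamma}$ are obviously modified according to (105) but, as Armstrong, Bloembergen, Ducuing and Pershan have shown, there is a further factor in $P$. This arises because the nonlinear polarization

$$p_i^{\alpha} = d_{ijk}^{\alpha\beta\gamma}(E_j^{\beta})_{\text{local}}(E_k^{\gamma})_{\text{local}}, \tag{106}$$

produced directly on the atoms, further polarizes the surrounding medium. We have

$$P_i^{\alpha} = p_i^{\alpha} + \Gamma N\alpha P_i^{\alpha}, \tag{107}$$

so that

$$P_i^{\alpha} = \frac{p_i^{\alpha}}{1 - \Gamma N\alpha} = (1 + \Gamma\chi_{ii}^{\alpha})\,p_i^{\alpha}. \tag{108}$$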


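Thus, if $d_{ijk}^{\alpha\beta\gamma}$ is the (calculated) intrinsic coefficient, the observed coefficient is

$$D_{ijk}^{\alpha\beta\gamma} = (1 + \Gamma\chi_{ii}^{\alpha})(1 + \Gamma\chi_{jj}^{\beta})(1 + \Gamma\chi_{kk}^{\gamma})\,d_{ijk}^{\alpha\beta\gamma}. \tag{109}$$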


Therefore, even if $d$ does not vary with $\chi$, $D$ will do so. This is, however, not enough to explain the observed variation of $D$ with $\chi$. For example, in semiconductors it is very likely that $\Gamma$ is small if not zero and yet the measured values of $D$ appear to obey Miller's rule and be proportional to $\chi^3$. Thus, the intrinsic coefficient $d$ itself must have a similar dependence on $\chi$.

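If we write

$$D_{ijk}^{\alpha\beta\gamma} = \chi_{ii}^{\alpha}\chi_{jj}^{\beta}\chi_{kk}^{\gamma}A_{ijk} \tag{110}$$

in terms of the measured susceptibilities (i.e., $n^2 - 1$), which is the content of Miller's rule, and then use (104) to express $D$ in terms of the atomic polarizabilities we obtain

$$D_{ijk}^{\alpha\beta\gamma} = (1 + \Gamma\chi_{ii}^{\alpha})(1 + \Gamma\chi_{jj}^{\beta})(1 + \Gamma\chi_{kk}^{\gamma})\,N\alpha_i^{\alpha}\,N\alpha_j^{\beta}\,N\alpha_k^{\gamma}\,A_{ijk} \tag{111}$$

so that from (109)

$$d_{ijk}^{\alpha\beta\gamma} = N\alpha_i^{\alpha}\,N\alpha_j^{\beta}\,N\alpha_k^{\gamma}\,A_{ijk}. \tag{112}$$

Thus, the reduced tensor is the same whether or not we apply local field corrections as long as we do it consistently. To obtain a more or less constant value of $A$ we must have $d$ varying as $\alpha^3$.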
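
The statement that the reduced tensor is unaffected by a consistent local-field correction can be illustrated numerically. In the Python sketch below the intrinsic coefficient and the three values of $N\alpha$ are arbitrary, and the function name `reduced_tensor` is hypothetical; it evaluates $A$ once from the observed quantities, as in (110), and once from the intrinsic ones, as in (112).

```python
import math


def reduced_tensor(d_intrinsic, n_alphas, gamma):
    """Return A computed two ways for a local-field factor gamma.

    First value: D / (chi_a * chi_b * chi_c), using (104) and (109).
    Second value: d / (N*alpha_a * N*alpha_b * N*alpha_c), as in (112).
    """
    chis = [na / (1.0 - gamma * na) for na in n_alphas]
    enhancement = math.prod(1.0 + gamma * chi for chi in chis)
    d_observed = enhancement * d_intrinsic
    return d_observed / math.prod(chis), d_intrinsic / math.prod(n_alphas)


if __name__ == "__main__":
    n_alphas = (0.04, 0.05, 0.06)   # arbitrary N*alpha at the three frequencies
    for gamma in (0.0, 2.0, 4.0 * math.pi / 3.0):
        a_obs, a_int = reduced_tensor(1.0e-9, n_alphas, gamma)
        print(f"Gamma = {gamma:.3f}: A(observed) = {a_obs:.6e}, "
              f"A(intrinsic) = {a_int:.6e}")
```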

Since $A$ for NH4H2PO4 derived from the purely optical s.h.g. effect agrees with $A$ from the quasi-static electro-optic effect to within 10 percent, although the values of $d$ differ by a factor of 12, and in BaTiO3 the two values of $A_{311}$ are within a factor 2 while the $d$'s differ by 300, it is clear that lattice polarization has a direct effect in $d$ not described by local field terms.

We repeat that optical nonlinearities have an electronic origin. Electrons in atoms do not move in a harmonic potential. Second harmonic generation, which can only involve electronic motion, is much the same in covalent organic materials, ionic crystals and ferro-electrics. Large values of $d^{2\omega}$ are associated exclusively with large refractive indices. Thus, nonlinearities in the ionic motion play a secondary role in nonlinear optics, however important they may be in determining the ferro-electric properties.

We shall attempt to construct a model, just sufficiently general to 
exhibit the gross features of ferro-electric behavior, and show that it 
modifies the nonlinear optical behavior exactly as predicted by Miller’s 
rule. The model is not put forward as an explanation of ferro-electricity 
although it has a venerable past in that connection, but as a demonstra- 
tion that a simple system with singular dielectric properties behaves 
in a way consistent with Miller’s rule. 

In Fig. 1, we illustrate a moderately realistic one-dimensional model in which electrons of mass $m$ are coupled to ions of mass $M$ in a lattice. Forces act between like and unlike particles and of these by far the strongest is $K_{mM}$, which is responsible for the electronic optical spectrum. The remaining forces determine the lattice spectrum. The important nonlinearities are associated with $K_{mM}$. The linear behavior of this model is formidably complicated and we therefore assume that its salient features are already evident in the much simpler model of Fig. 2.

Fig. 1— A realistic one-dimensional model.

The electron of mass $m_1 = m$ is coupled to the ion of mass $m_2 = M$ by the force constant $k_{12}$ which replaces $K_{mM}$. It is anharmonic. The electron and the ion are also coupled to rigid supports representing the rest of the crystal by forces $k_1$ and $k_2$. It is as though we had gone directly from the Born-Von Karman theory of specific heats to the Einstein theory without mentioning Debye.

Let $x_1$ be the displacement of the electron of charge $e_1$ and $x_2$ that of the ion of charge $e_2$. We shall assume that the potential energy is

$$\varphi = \tfrac{1}{2}k_1x_1^2 + \tfrac{1}{2}k_2x_2^2 + \tfrac{1}{2}k_{12}(x_1 - x_2)^2 + v_{12}(x_1 - x_2)^3 \tag{113}$$

so that the anharmonic term is exclusively associated with the "atomic" binding of the electron to its parent ion. It will be convenient to define $v_{21} = -v_{12}$. The equation of motion in a field $E^{\beta}e^{i\beta t}$ is then

$$m_i\ddot{x}_i + k_ix_i + k_{12}(x_i - x_j) + 3v_{ij}(x_i - x_j)^2 = e_iE^{\beta}e^{i\beta t} \tag{114}$$
and the linear response neglecting $v_{ij}$ is

$$x_i = \frac{(k_j - m_j\beta^2)e_i + k_{12}(e_1 + e_2)}{(k_1 + k_{12} - m_1\beta^2)(k_2 + k_{12} - m_2\beta^2) - k_{12}^2}\,E^{\beta}e^{i\beta t}. \tag{115}$$

With $N$ units in unit volume, the polarization is

$$P = N(e_1x_1 + e_2x_2)$$

and so

$$\chi^{\omega} = N\,\frac{e_1^2(k_2 - m_2\omega^2) + e_2^2(k_1 - m_1\omega^2) + k_{12}(e_1 + e_2)^2}{(k_1 + k_{12} - m_1\omega^2)(k_2 + k_{12} - m_2\omega^2) - k_{12}^2}. \tag{116}$$

At an optical frequency $\omega$ well above $(k_2/m_2)^{1/2}$, the ionic resonance,

$$\chi^{\omega} \approx \frac{Ne_1^2}{k_1 + k_{12} - m_1\omega^2} \tag{117}$$

while at dc

$$\chi^{0} = N\,\frac{k_2e_1^2 + k_1e_2^2 + k_{12}(e_1 + e_2)^2}{k_1k_2 + k_{12}k_1 + k_{12}k_2}. \tag{118}$$

Fig. 2— A simplified one-dimensional model.

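To obtain the sum frequency polarization due to two fields $E^{\beta}e^{i\beta t}$ and $E^{\gamma}e^{i\gamma t}$ we substitute the linear responses back into the nonlinear term in (114). The result is a nonlinear coefficient

$$d^{\alpha\beta\gamma} = -3Nv_{12}\,f(\alpha)f(\beta)f(\gamma), \tag{119}$$

where

$$f(\alpha) = \frac{e_1(k_2 - m_2\alpha^2) - e_2(k_1 - m_1\alpha^2)}{(k_1 + k_{12} - m_1\alpha^2)(k_2 + k_{12} - m_2\alpha^2) - k_{12}^2}. \tag{120}$$

If we express $d^{\alpha\beta\gamma}$ as $\chi^{\alpha}\chi^{\beta}\chi^{\gamma}A$ we have

$$A = -\frac{3v_{12}}{N^2e_1^3}\,g(\alpha)g(\beta)g(\gamma), \tag{121}$$

where

$$g(\alpha) = \frac{k_2 - m_2\alpha^2 - \dfrac{e_2}{e_1}(k_1 - m_1\alpha^2)}{k_2 - m_2\alpha^2 + \left(\dfrac{e_2}{e_1}\right)^2(k_1 - m_1\alpha^2) + k_{12}\left(1 + \dfrac{e_2}{e_1}\right)^2}, \text{ etc.} \tag{122}$$

We note first that if, as seems most reasonable, $e_2 = -e_1$ then $g(\alpha) = g(\beta) = g(\gamma) = 1$. In any case at optical frequencies $g(\omega) \approx 1$ for all reasonable values of $e_2/e_1$ and at dc

$$g(0) = \frac{k_2 - \dfrac{e_2}{e_1}k_1}{k_2 + \left(\dfrac{e_2}{e_1}\right)^2k_1 + k_{12}\left(1 + \dfrac{e_2}{e_1}\right)^2} \tag{123}$$

which is also near unity if $e_2 \approx -e_1$. Thus, to all intents

$$A \approx -\frac{3v_{12}}{N^2e_1^3} \tag{124}$$

which is exactly the result obtained by neglecting the ionic motion.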
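
The behavior of the factor $g$ in (122) and (123) can be explored with a few lines of Python. This is a sketch with arbitrary force constants and masses, and `g_factor` is a hypothetical name; it shows that $g$ is exactly 1 for $e_2 = -e_1$ and stays close to 1 both at dc for nearly opposite charges and at a frequency well above the ionic resonance.

```python
def g_factor(alpha, k1, k2, k12, m1, m2, charge_ratio):
    """Equation (122), with charge_ratio = e2 / e1."""
    a = k1 - m1 * alpha ** 2
    b = k2 - m2 * alpha ** 2
    r = charge_ratio
    return (b - r * a) / (b + r * r * a + k12 * (1.0 + r) ** 2)


if __name__ == "__main__":
    params = dict(k1=1.0, k2=1.0, k12=5.0, m1=1.0, m2=1.0e4)
    for r in (-1.0, -0.9, -0.5):
        print(f"e2/e1 = {r:+.1f}: g(0) = {g_factor(0.0, charge_ratio=r, **params):.4f}, "
              f"g(optical) = {g_factor(1.0, charge_ratio=r, **params):.6f}")
```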

Thus, $A$ is an intrinsic electronic property and the effect of ionic motion is entirely contained in its effect on $\chi$. We note, however, that in some ferro-electrics, where the departure from inversion symmetry is both small and temperature dependent, $A$ will also be temperature dependent.

If $k_1$, $k_{12}$, and $k_2$ are all positive, the dc susceptibility is greater than the low frequency limit of $\chi^{\omega}$ but not dramatically so. There is, however, no reason why one of these constants should not be negative. Negative compliances are familiar in classical mechanics; a well-known example is the common automatic door stop which exhibits a positive compliance as the door is first opened but a negative compliance when the door is almost fully open. The force between atoms as a whole in a lattice exhibits a positive compliance but if we separate this force into nuclear-nuclear and electron-electron repulsion and nuclear-electron attraction, it is quite reasonable to assume that at the equilibrium distance the latter component has a negative compliance.

It is immaterial which term in (113) we take as negative although on physical grounds it seems most suitable to take $k_1$ and this is also a convenient choice.

Provided that

$$k_1k_2 + k_{12}k_1 + k_{12}k_2 > 0 \tag{125}$$

or

$$\eta = -k_1\,\frac{k_2 + k_{12}}{k_2k_{12}} < 1 \tag{126}$$

the system remains in stable equilibrium at $x_1 = x_2 = 0$. The natural resonances $\omega_1$ and $\omega_2$ of the system satisfy

$$m_1m_2\omega_1^2\omega_2^2 = k_1k_2 + k_1k_{12} + k_2k_{12} \tag{127}$$

and so as $\eta \to 1$ one of these frequencies $\to 0$. At the same time the dc susceptibility (for simplicity we take $-e_2 = e_1 = e$)

$$\chi^{0} = \frac{Ne^2}{k_{12}}\,\frac{k_1 + k_2}{k_2(1 - \eta)} \tag{128}$$

becomes infinite, while the low-frequency limit of the optical susceptibility remains finite.
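
The approach to instability as $\eta \to 1$ can be followed numerically. The Python sketch below uses arbitrary values of $k_2$, $k_{12}$ and the masses, with hypothetical function names; it evaluates $\eta$ from (126), the two squared resonance frequencies of the coupled system, and the dc susceptibility for $-e_2 = e_1 = e$, and shows the lower frequency falling toward zero while $\chi^0$ grows.

```python
import math


def eta(k1, k2, k12):
    """Stability parameter of equation (126)."""
    return -k1 * (k2 + k12) / (k2 * k12)


def squared_frequencies(k1, k2, k12, m1, m2):
    """Roots of m1*m2*w^4 - [m1(k2+k12) + m2(k1+k12)]*w^2 + det = 0."""
    a11 = k1 + k12
    a22 = k2 + k12
    b = m1 * a22 + m2 * a11
    det = a11 * a22 - k12 ** 2          # equals k1*k2 + k1*k12 + k2*k12
    disc = math.sqrt(b * b - 4.0 * m1 * m2 * det)
    return (b - disc) / (2.0 * m1 * m2), (b + disc) / (2.0 * m1 * m2)


def chi_dc(k1, k2, k12, n=1.0, e=1.0):
    """Static susceptibility (118) with -e2 = e1 = e."""
    return n * e * e * (k1 + k2) / (k1 * k2 + k12 * k1 + k12 * k2)


if __name__ == "__main__":
    k2, k12, m1, m2 = 1.0, 5.0, 1.0, 100.0
    for k1 in (-0.5, -0.7, -0.8, -0.83):
        low, high = squared_frequencies(k1, k2, k12, m1, m2)
        print(f"k1 = {k1:+.2f}: eta = {eta(k1, k2, k12):.3f}, "
              f"w_low^2 = {low:.5f}, chi0 = {chi_dc(k1, k2, k12):.3f}")
```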

If $\eta$ exceeds unity there is a spontaneous polarization limited only by terms such as $wx_2^4$ which we have failed to include in $\varphi$.

All this is reminiscent of ferro-electric behavior if $\eta$ is temperature dependent and the Curie point corresponds to $\eta = 1$.

The inclusion of a term $wx_2^4$ in $\varphi$ will, in fact, make $\eta$ temperature dependent, for the effect of this term is to replace $k_2$ by an effective value for low-frequency displacements

$$k_2' \approx k_2 + 6w\overline{x_2^2} = k_2(1 + \lambda T), \tag{129}$$

where $\overline{x_2^2}$ is the mean square thermal displacement. As a result if $\eta_0$ is the value at $T = 0$ we have

$$\eta = \eta_0\left(1 - \lambda T\,\frac{k_{12}}{k_2 + k_{12}}\right) \tag{130}$$

and so

$$\chi^{0} = \frac{Ne^2}{k_{12}}\,\frac{\left(-\dfrac{k_2}{k_1} - 1\right)}{\lambda(T - T_c)} \tag{131}$$

if we define $T_c$ as the temperature at which $\eta = 1$. This is of course a crude approximation to a Curie-Weiss Law.

By ascribing all the temperature dependence to changes in $k_2$, it is obvious from (117) that $\chi^{\omega}$ is temperature independent. For $\eta$ to be equal to unity, it is not necessary for $-k_1$ to be of the same magnitude as $k_{12}$; all we require [see (126)] is that $-k_1$ be near $k_2$. Thus, from (117), we do not expect any very anomalous values of $\chi^{\omega}$ in ferro-electrics, except in so far as materials with a high electronic polarizability are more likely to be ferro-electric.
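
The Curie-Weiss behavior of this model can be illustrated directly. The Python sketch below assumes, as in the text, that the only temperature dependence is $k_2 \to k_2(1 + \lambda T)$; the numerical values of $k_1$, $k_2$, $k_{12}$ and $\lambda$ are arbitrary and the function names are hypothetical. With this assumption the denominator of the dc susceptibility is exactly linear in $T - T_c$.

```python
import math


def stability_determinant(k1, k2, k12):
    """Left-hand side of (125)."""
    return k1 * k2 + k12 * k1 + k12 * k2


def curie_temperature(k1, k2, k12, lam):
    """Temperature at which the determinant vanishes for k2 -> k2*(1 + lam*T)."""
    return (-k12 * k1 / (k2 * (k1 + k12)) - 1.0) / lam


def chi_dc(k1, k2, k12, lam, temperature, n=1.0, e=1.0):
    """Static susceptibility with -e2 = e1 = e and a temperature-dependent k2."""
    k2_t = k2 * (1.0 + lam * temperature)
    return n * e * e * (k1 + k2_t) / stability_determinant(k1, k2_t, k12)


if __name__ == "__main__":
    k1, k2, k12, lam = -0.9, 1.0, 5.0, 1.0e-3   # illustrative values only
    t_c = curie_temperature(k1, k2, k12, lam)
    print("T_c =", t_c)
    for dt in (0.1, 1.0, 10.0, 50.0):
        chi = chi_dc(k1, k2, k12, lam, t_c + dt)
        print(f"T - T_c = {dt:5.1f}: chi0 = {chi:10.3f}, "
              f"chi0*(T - T_c) = {chi * dt:.3f}")
```

The product $\chi^0(T - T_c)$ printed in the last column varies only slowly with temperature, which is the sense in which the model reproduces a Curie-Weiss law.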

We have now shown that it is possible to incorporate in our model features which lead to quite different behavior for the optical and dc dielectric constants without either invalidating Miller's rule or even changing the value of $A$ which is essentially a purely electronic property.

We should, therefore, expect the temperature variation of $D_{ijk}^{\alpha\beta\gamma}$ to correspond to that of $\chi_{ii}^{\alpha}\chi_{jj}^{\beta}\chi_{kk}^{\gamma}$, and this is well borne out by the measurements of Zwicker and Scherrer of the electro-optic coefficients and Bass, Franken, and Ward of the optical rectification coefficients in the dihydrogen phosphates. Both coefficients are directly proportional to the dc dielectric constant which obeys a Curie-Weiss Law.

In KDP there is almost no change of the s.h.g. coefficient (Van de Ziel and Bloembergen) with temperature above or below the Curie point, in accord with our expectations, but at the Curie point there is a small discontinuous change. In an orthorhombic coordinate system $d_{311}$ and $d_{322}$ are equal above $T_c$ but below $T_c$, $d_{311}$ increases and $d_{322}$ decreases while at the same time $\chi_{11} - \chi_{33}$ decreases and $\chi_{22} - \chi_{33}$ increases. With a constant $A$ this is not compatible with Miller's rule.

However, at the transition there is a change in crystal class in which $a_1$ increases and $a_2$ decreases (Jona and Shirane). It is not unreasonable to assume that this increases $T_{311}$ and decreases $T_{322}$ by more than enough to compensate for the changes in $\chi_{11}$ and $\chi_{22}$.


VI. MILLER’S RULE 


The classical anharmonic oscillator model, which we have shown to 
be a good approximation to the behavior of a real system, leads directly 
to that part of Miller’s rule which refers to the geometric properties 
and frequency dependence of the nonlinear coefficients in a single 
material. We have also in (51) advanced a crude argument to show that 
A will not vary much from material to material. 

When we examine the experimental data we shall see that the allowed components of $A$ are between $1 \times 10^{-6}$ and $6 \times 10^{-6}$ esu for most materials but that there are a few materials with significantly higher values and a number with values as low as $0.1 \times 10^{-6}$ esu.

In most cases, these exceptional values have a rather simple explanation and we have therefore to explain a constancy of $A$ to within a factor of about 10.

Neglecting the effects of lattice polarization and local field corrections, which we have shown are irrelevant, the results for the classical anharmonic oscillator model are, from (50) and (56),


T 3x : 


em, (132) 


Air — 3 


This is also the result from the static perturbation treatment of Section II.








If we use 
= wy an ©) 
x = 4N aes gN ar, (133) 
we arrive at 
_ 243 do Vin. 
Ai, = 16 e N27?) (134) 
Now the volume occupied by the oscillator is both $1/N$ and $8\langle r^2\rangle^{3/2}$ and
so
4/ 23 T 5% 
Aijr wd, 10 (r » (?)3 esu (135) 


where we have inserted numerical values for $a_0$ and $e$. This expresses $A_{ijk}$ as the product of a scale factor $\langle r^2\rangle^{1/2}$ and a dimensionless shape factor $T/r^3$. Whether we assign to each oscillator the volume per valence electron, per atom or per group of atoms, $\langle r^2\rangle^{1/2}$ is likely to be between 0.75 and 3 Å, so that $A$ will be sensibly constant near $3 \times 10^{-6}$ esu if the shape factor is of the order of 0.01 to 0.05. We have from Turner, Saturno, Hank and Parr's results for CH4 a shape factor of 0.05, and so this range of shape factors is not unreasonable. It corresponds to a linear distortion $0.02^{1/3} \approx 25$ percent. It is also not unreasonable that the distortion should be of this general order, wherever it is allowed by symmetry. We may speculate that much smaller values of $T/r^3$ would imply very weak interatomic forces and much larger values would lead to a structure unstable relative to a more symmetric arrangement.
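
The arithmetic linking the shape factor to a linear distortion is a cube root, and can be reproduced as follows. This is a small Python illustration only; the function name is hypothetical.

```python
def linear_distortion(shape_factor):
    """Cube root of the dimensionless shape factor T / r^3."""
    return shape_factor ** (1.0 / 3.0)


if __name__ == "__main__":
    for s in (0.01, 0.02, 0.05):
        print(f"shape factor {s:.2f} -> linear distortion "
              f"{100.0 * linear_distortion(s):.0f} percent")
```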

Thus, qualitatively, the relative constancy of A reflects relatively 
constant shape factors, although we can hardly claim that this is more 
than a sophisticated form of dimensional analysis. It does, however, 
suggest that A is determined primarily by the geometric properties of 
the molecular and crystal structure. 

Large values of $A$ will occur only when the molecules themselves depart considerably from inversion symmetry and are arranged in the crystal in such a way that the effects of individual parts of the molecule are additive. Small values of $A$ will occur when sections of the molecule have local near inversion symmetry or when their disposition in the crystal favors the cancellation of effects from different atomic groupings. However the molecules are arranged in the lattice, $A$ will be small if the molecules themselves have near inversion symmetry, or consist of uncoupled parts with the same property.

In Tables II, III, and IV, we present 50 values of $A$ derived from

TABLE II — SECOND HARMONIC COEFFICIENTS

Units of d: 10^-9 esu. Units of A: 10^-6 esu.

Material Class Ay di23 A Ref. 
HMT = N, (CHa). 43m | 1.06 30 «({17 | 
70S 43m |1.06|} 153 | 3.5 3 
ZnS 43m |10.6 146 | 4.5 2 
ZnSe 43m 11.06] 200 | 2.5 3 
ZnSe 43m {10.6 370 | 6.6 2 
ZnTe 43m |}1.06| 660 | 2.9 3 
ZnTe 43m |10.6 440 | 3.6 2 
CdTe 43m |10.6 s00 | 7 9 
GaP 43m | 1.06 525 | 1.3 4 
GaP 43m | 1.06 255 0.6 3 
GaAs 43m {1.06 | 1,500 | 1 4 
GaAs 43m |10.6 | 1,760 | 3.7 2 
InAs 43m |10.6 | 2,000 | 8.2 2 

d321 A 
KH.PO, 42m | 1.06 3 |3.6 313.6 5 
KD2PO, 42m | 1.06 2.7/3.2] 2.7/3.2 5 
KH2AsOs 42m | 1.06 3.4/2.6 | 3.2/2.5 4 
NH.H2PO, 42m | 1.06 2.913.15| 3. 138.15 5 
d333 A dsu A dii3 A 
ZnO 6mm | 1.06 43 /8.3 | 138 ]1.1 14] 1.1 4 
ZnS 6mm | 1.06 84 {1.9 3 
ZnsS 6mm /|10.6 180 |4.9 | 90 /|2.45} 102] 2.7 2 
Cd8 6mm | 1.06 186 /8.2 | 96 |1.6 |] 105] 1.8 2 
Cds 6mm |10.6 210 {5.4 |126 j3.3 | 188} 3.6 Z 
CdSe 6mm | 1.06 500 [3.4 3 
CdSe 6mm {10.6 260 /4.8 |186 /|2.4 |} 148) 2.6 2 
BaTiO; 4mm | 1.06 42 |1.0 {111 |2.45] 105) 2.385 | 5 
d222 A 
LiNbO; 3m | 1.06 250 |9 36 «(|1.1 19| 0.55 | 6 
LiNbO; 3m | 1.152 32 /1.05) 15) 0.45; 6 
ut A 

S10. 32 1.06 2.5)1.9 4 
AIPO, 32 1.06 2.5)2.2 4 
Se 32 (10.6 380 {2.1 2 
Te 32 110.6 [25,400 [4.3 7 
TABLE III — OPTICAL RECTIFICATION COEFFICIENTS

Units of d: 10^-9 esu. Units of A: 10^-6 esu.

Material | Class a dius | A ‘Ref. 
ZnTe 43m 0.694 3650 | 14 8 
1.06 1040 5 8 
d301 A 
KHePO, 42m 0.694 50 | 3.2 8 
KD.PO,; 42m 0.694 105 | 2.9 8 
d333 A ds311 A 
Cds 6mm 0.694 700 ri 900 | 9 S 


published s.h.g. data, 7 values from optical rectification data and 50 from electro-optic data. Definitions and conventions are discussed in the appendix and a separate list of references is given for the data in the appendix. Probable errors vary from measurement to measurement. It is probably safe to say that no measurement has an accuracy better than ±10 percent and in many cases the probable error is greater. That for the s.h.g. data at 10.6 μ is 30 percent and except for ADP the rectification data are only good to a factor of 3. In addition, a few materials have discordant results reported by different groups and this suggests that, especially in the case of crystals which are difficult to grow, the data should be regarded rather critically. In the case of CuCl, Sterzer, Blattner and Miniter have constructed a modulator whose behavior is consistent with the higher value of the electro-optic coefficient. This casts some doubt on the low value for CuBr reported in conjunction with CuCl. In the case of the linear e-o coefficient in HMT, Heilmeyer's value $d_{123} = 32 \times 10^{-9}$ esu is the most recent and reliable.

The average of all the s.h.g. data is $A = 3.3 \times 10^{-6}$ esu, and only two coefficients, $d_{123}$ in HMT and $d_{333}$ in LiNbO3, exceed $6 \times 10^{-6}$ esu by more than the probable error. One coefficient, $d_{222}$ in LiNbO3, is unambiguously less than $1 \times 10^{-6}$ esu. The sole accurate rectification coefficient, that in NH4H2PO4, is $3 \times 10^{-6}$ esu, which is remarkably close to the values 3.15 and $3.2 \times 10^{-6}$ esu for s.h.g. and the electro-optic effect.

Whereas the s.h.g. data have a rather compact distribution about $3 \times 10^{-6}$ esu the linear e-o data are more straggled. The mean value is $2.3 \times 10^{-6}$ esu but there are a considerable number of materials with $A < 1 \times 10^{-6}$ esu. The difference between the averages $A_{\text{shg}}$ and $A_{\text{eo}}$ is not due to the different materials in the two lists; it persists if we
TABLE IV — ELECTRO-OPTIC COEFFICIENTS

Units of d: 10^-9 esu. Units of A: 10^-6 esu.

Material Class | di23 A Ref. 
HMT = Na(CH2)s6| 43m |0.5 | 32 14 10 
HMT = Na(CHo)e| 43m 6 2.3 11 
HMT = Na(CHe)s| 48m 55 21 12 
Bis(GeOa)3 43m 22 0.8 assumed ¢« = 6 13 
Sodalite 43m 9.5 1.8 14 
CuCl 43m 28 0.75 16 
CuCl 43m 110 3 15 
CuBr 43m 22 0.4 assumed « = 10 16 
ZnS 43m |0.65} 74 0.9 li 
ZnSe 13m |0.55| 120 0.8 18 
ZnTe 43m |0.60| 440 1.5 19 
GaP 43m |0.63] 150 0.3 20 
GaAs 43m |1.02] 215 0.3 21 
NaClO3 23 0.59] 2.5 | 06 22 
KeMeg2(8O.)s 23 <.26; <0.1 assumed « = 6 23 
(NH4)2Mn2(SO4)s 23 4.3 0.5 assumed e€ = Y 24 
eeacermeny,| | | it) 8 : 

aVOo2 3 3 23 a 1.31] a 3s 20 
NasSbSi:9H:0 23 10 g yassumned ¢ = 6 26 
‘Tren Chloride 23 9.5 a7 27 
i | si i a i a eee i a ns 

1333 A ding A 
ZnO 6mm |0.63; 50 1.5 26 (0.8 )ais 28 
Zns 6mm 10.63) 67 0.9 34 |U.45 +— neg. 28 
CdS 6mm |0.63] 110 12] 48 |o.55f4s3 28 

Material Class | A 123 A | aie) A Ref. 
KH2POs “Jam |0.55| +65 4.0 | —50 |1.7 constant stress 29 
IWH2PO. 42m 60 aad constant strain 30 
KD2POs 42m 160 4.0 constant stress 31 
KHeAsOa 42m 77 3.7 84 1.7 constant stress 32 
RbHeAsO, 42m 92 3.5 constant stress 32 
NH4HePO, 42m +55 4.4 | ~—146 |3.4 constant stress 32,29 
NHaHe2PO, 42m 36 3.9 constant strain 

7” a i - 113 
333 A d3 A din3 A) = 
333 
BaTiOs 4mm |0.63/1000 | 1 ae 320 —jo.3] + | 33 
3.3 *104/1 .4 constant strain 34 
6.6 X104]1 .9 constant stress 35 
=: = dis 
333 A dait A dirs A d222 A|— 
333 
LiNbOs | 3m 0.63 860 4.3 340 12.5 | 280 | 1.2 110 | 0.3) + 36 
di A 
SiOz 32 los | 32/09 © 22 
K2S20¢ 32 10.55 1.4] 0.4 11 
SrS206° H20 32 (0.55 0.65) 0.15 eee: 11 
CeHiOsNaBr: H2O0 32 {0.55 0.65| 09.15 (25sttined € =v 11 
CsC4H:0s | 32 10.55 7 1.4 11 7 
d312 A 
| q | 9.7/7 37 


C(CH:0H)4 
eliminate all materials not common to both lists and may, therefore, be either a real effect or a systematic error.

A few materials [e.g., SrS2O6·H2O, C6H12O6·NaBr·H2O and K2Mg2(SO4)3] have very low values of $A$. The latter is especially interesting since the isomorphous (NH4)2Cd2(SO4)3 and (NH4)2Mn2(SO4)3 salts have somewhat larger values. The ammonium cadmium salt is known to be ferro-electric at very low temperatures and the ammonium manganese salt is also suspected of ferro-electricity (Jona and Shirane). More significant, perhaps, is the fact that the divalent ions have very nearly regular octahedral coordination (Zemann and Zemann) and so form a unit with near inversion symmetry and contribute little to $d$. The main contribution comes from the monovalent ions and their irregularly placed neighbors. The difference between the potassium and ammonium salts would then be due to the difference in the polarizability of the two ions. For K+ the refractivity is 2.45 cc and for NH4+ it is 4.05 cc (see Le Fevre). If this enters $d$ as a cube the expected ratio of the $d$ coefficients would be 4.5. The observed value is greater than about 5. Note also that NH4+ itself lacks inversion symmetry.
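
The expected ratio quoted above follows from cubing the ratio of the two ionic refractivities. A short Python check, using the two refractivity values given in the text and a hypothetical function name, is:

```python
def cubic_ratio(refractivity_a, refractivity_b):
    """Ratio of d coefficients if the refractivity enters d as a cube."""
    return (refractivity_a / refractivity_b) ** 3


if __name__ == "__main__":
    # Refractivities in cc as quoted in the text: ammonium 4.05, potassium 2.45.
    print(f"expected d ratio = {cubic_ratio(4.05, 2.45):.2f}")
```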

The tabulated values of $A$ show that Miller's rule is an excellent rough guide to the probable value of $d$. If the component is allowed by symmetry

$$|d_{ijk}^{\alpha\beta\gamma}| \approx 3 \times 10^{-6}\,\chi_{ii}^{\alpha}\chi_{jj}^{\beta}\chi_{kk}^{\gamma}\ \text{esu}. \tag{136}$$


However, the rule by itself is not infallible. Occasionally, a value of $d$ much higher than that predicted by (136) will occur. In some cases (e.g., $d_{333}$ in LiNbO3) this is accompanied by a very low value of another coefficient and it is then plausible to assume that this is due to a particularly critical geometric configuration. In other cases (e.g., HMT) it is quite clearly due to the coincidence of a number of favorable factors. The atoms, the molecule and the crystal all have the same symmetry and moreover, as we saw in Section 1, all the separate contributions to $d$ have the same sign. Thus, it is likely that the value of $A \approx 15 \times 10^{-6}$ esu for HMT represents something of an upper limit to what is possible.

More often (136) will overestimate $d$. This is especially likely to occur if the molecules themselves, or large sections of the molecule, have near inversion symmetry, but it may also occur if the crystal structure itself departs only very slightly from a centro-symmetric structure.


VII. CONCLUSION 


If reasonably good ground state wave functions are available, the direct perturbation method of Section II seems most suitable as a basis for calculating the magnitudes of the coefficients. It gives the intrinsic nonlinear coefficient
38x" T esx 
Sep eee EE 13 
in = (137) 
in terms of the intrinsic low-frequency limit of the optical suscepti- 
bility x* and a cubic moment in the ground state. If the electrons are 
uncorrelated, this can be replaced by 
3x” 
eo 7 best (188) 
and t,;;, can be obtained from the charge distribution. From (138) we 
obtain the reduced tensor 
3 beck 
ace (x°)” 








Asin, = (139) 
and we can then incorporate this directly in Miller's rule using the observed susceptibilities $\chi_{ii}^{\alpha}$, etc. to obtain $d_{ijk}^{\alpha\beta\gamma}$.

This continuation of the basic perturbation calculation with Miller's rule appears to be the most straightforward approach to the coefficients. Apart from the cubic moment $t_{ijk}$ it involves only experimental quantities.

The analogy with the classical anharmonic oscillator established in 
Section IV seems most likely to be fruitful in qualitative discussions 
of the general behavior of the coefficients. It appears to have both 
empirical and theoretical justification. 

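Obviously, on this basis further generalizations of Miller's rule are possible. For example, we might expect the fourth rank tensor $d_{ijkl}^{\alpha\beta\gamma\delta}$ which describes induced second harmonic generation, the Kerr effect, etc. to satisfy a relation of the form

$$d_{ijkl}^{\alpha\beta\gamma\delta} = \chi_{ii}^{\alpha}\chi_{jj}^{\beta}\chi_{kk}^{\gamma}\chi_{ll}^{\delta}A_{ijkl}. \tag{140}$$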


A calculation based on fourth-order perturbation theory and a lavish 
use of sum rules leads to 


Asin 1.5 X 10° =$ (r 23 ° ae (141) 


where (r’) is the mean square radius of the charge distribution and 
Q;.. 18 the semi-invariant 


O sini = (0:0 ;0,201) ai 2(x 0,242) (142) 


if we assume that all odd moments vanish. 
If we take $\langle r^2\rangle^{1/2}$ as 1 Å this gives
Ave 3 xX 10° ase esu. (143) 


We have seen that in the lower-order processes $T/r^3$ is of the order of $2 \times 10^{-2}$. This does not imply that $Q/r^4$ is of the order $(2 \times 10^{-2})^{4/3} \approx 5 \times 10^{-3}$ for whereas $T$ is nonzero only because of asymmetric molecular and intermolecular forces, $Q$ is nonzero even for free atoms or ions. For
example, in the hydrogen atom Q,;;;/(r°)’ = 5/18 and Q;,;;/(r°)’ = 
1/9 so that we expect A to be of the order of 3 X 10°"’ to 10°” esu. 
For calcite with x ptiecay = 0.1 and xa, = 0.55 this gives a value of d 
between 3 X 10°" and 10°” esu. Bjorkholm and Siegman*’ have meas- 
ured 3 X 10°“ esu. 

We have seen that the reduced tensor $A_{ijk}$ is proportional to the cubic moment,

$$T_{ijk} \approx \langle x_i x_j x_k\rangle,$$

and it is therefore clearly symmetric in all its indices. This is in agreement with Kleinman's hypothesis and follows from the origin of the nonlinear behavior in the electronic motion.

Finally, we remark that nothing increases $d$ like large values of the linear susceptibilities, yet, although the values of most allowed reduced tensor components $A_{ijk}$ are near $3 \times 10^{-6}$ esu, they can vary by a factor 100:1. The molecular geometry will often indicate which end of the range is likely to apply to a particular material.
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APPENDIX 


We have throughout adopted a notation, originally introduced by 
Bloembergen*” and his colleagues, in which two fields with complex 
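time dependence $E^{\beta} e^{i\beta t}$ and $E^{\gamma} e^{i\gamma t}$ produce a polarization $P^{\alpha} e^{i\alpha t}$ at 
the algebraic sum frequency $\alpha = \beta + \gamma$ according to 

$$P_i^{\alpha} e^{i\alpha t} = d_{ijk}^{\alpha\beta\gamma} E_j^{\beta} E_k^{\gamma} e^{i(\beta+\gamma)t}. \tag{144}$$

If the actual fields vary as $\cos \omega_1 t$ and $\cos \omega_2 t$ there will be terms in 
P at $\omega_1 + \omega_2$ and $\omega_1 - \omega_2$ obtained from (144) by letting $\beta = \omega_1$, $\bar\omega_1$, 
etc. where $\bar\omega_1 = -\omega_1$. 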

This notation has several advantages in theoretical calculations for 
much the same reason that the use of complex numbers simplifies ac 
circuit theory, and for much the same reason it has a number of dis- 
advantages in calculating numerical values. For this reason, it has not 
gained general acceptance by experimentalists who tend to use a number 
of different notations, some of which, especially in electro-optics, are 
of respectable antiquity. The difference between the two notations 
introduces various factors of 2. These are independent of the general 
reluctance of physicists to state unequivocally whether they are using 
peak or rms fields. In particularly fertile ground, these various factors 
can luxuriate and blossom as factors of 8 in the final answer. We use 
peak fields in all our definitions. 

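If the applied field is 

$$\mathbf{F}(t) = (0,\; F_2 \cos \omega t,\; F_3 \cos \Omega t), \tag{145}$$

it has components $E_2^{\omega} = E_2^{\bar\omega} = \tfrac12 F_2$, $E_3^{\Omega} = E_3^{\bar\Omega} = \tfrac12 F_3$ and the 1 
component of P(t) is 

$$P_1(t) = \tfrac14\{d_{123}^{\omega\Omega} F_2 F_3 e^{i(\omega+\Omega)t} + \text{cc}\} + \tfrac14\{d_{132}^{\Omega\omega} F_3 F_2 e^{i(\omega+\Omega)t} + \text{cc}\} + \tfrac14\{d_{123}^{\omega\bar\Omega} F_2 F_3 e^{i(\omega-\Omega)t} + \text{cc}\} + \tfrac14\{d_{132}^{\bar\Omega\omega} F_3 F_2 e^{i(\omega-\Omega)t} + \text{cc}\} \tag{146}$$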


where we have used d%23 = d%? and suppressed the first superscript, 
which is always the algebraic sum of the second and third superscripts. 
We can also write (146) as 


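$$P_1(t) = \tfrac12\{d_{123}^{\omega\Omega} + d_{132}^{\Omega\omega}\} F_2 F_3 \cos(\omega+\Omega)t + \tfrac12\{d_{123}^{\omega\bar\Omega} + d_{132}^{\bar\Omega\omega}\} F_2 F_3 \cos(\omega-\Omega)t. \tag{147}$$

If $\omega = \Omega$, this gives 

$$P_1(t) = \tfrac12\{d_{123}^{\omega\omega} + d_{132}^{\omega\omega}\} F_2 F_3 \cos 2\omega t + \tfrac12\{d_{123}^{\omega\bar\omega} + d_{132}^{\bar\omega\omega}\} F_2 F_3. \tag{148}$$

Now the usual experimental definition would be 

$$P_1(t) = (d_{123}^{2\omega} + d_{132}^{2\omega}) F_2 F_3 \cos 2\omega t + (d_{123}^{0} + d_{132}^{0}) F_2 F_3 \tag{149}$$

and so we see that 

$$\text{s.h.g.} \qquad d_{ijk}^{2\omega} = \tfrac12 d_{ijk}^{\omega\omega} \tag{150}$$

$$\text{rectification} \qquad d_{ijk}^{0} = \tfrac12 d_{ijk}^{\omega\bar\omega}. \tag{151}$$

If we let $\Omega = 0$, we have 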


P,(t) = iss’ + diss’) FFs cos wt. (152) 
The experimental definition reads 
P(t) = dyo3F2F'; cos wt (153) 
1.€., 
6x12 = Aish’, , (154) 


so that it is possible to contract the last two suffices according to the 
scheme 

$$11 \to 1,\quad 22 \to 2,\quad 33 \to 3,\quad 32 = 23 \to 4,\quad 31 = 13 \to 5,\quad 12 = 21 \to 6. \tag{155}$$
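The contraction scheme (155) is easy to mistype, so a small lookup can help when converting between the three-index and two-index forms. The following is only an illustrative sketch; the function names `contract` and `contracted_label` are hypothetical and do not come from the paper.

```python
# Contraction of a symmetric index pair (j, k) into a single index p,
# following scheme (155): 11->1, 22->2, 33->3, 23->4, 13->5, 12->6.
CONTRACTION = {
    (1, 1): 1, (2, 2): 2, (3, 3): 3,
    (2, 3): 4, (1, 3): 5, (1, 2): 6,
}


def contract(j, k):
    """Return the contracted index p for the pair (j, k); order is ignored."""
    return CONTRACTION[tuple(sorted((j, k)))]


def contracted_label(i, j, k):
    """Map a three-index label (i, j, k) to its two-index form (i, p)."""
    return (i, contract(j, k))


print(contracted_label(2, 3, 1))
print(contracted_label(1, 2, 3))
```

Sorting the pair before the lookup encodes the assumption that the coefficient is symmetric in its last two suffixes, which is what makes the contraction possible in the first place.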


Thus, $d_{ip}$ (i = 1 ··· 3, p = 1 ··· 6) represents $d_{ijk}$ and, for example, 
$d_{25} = d_{231} = d_{213}$. 


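$$P_1 \cos 2\omega t = (d_{123}^{2\omega} + d_{132}^{2\omega}) F_2 F_3 \cos 2\omega t = 2 d_{123}^{2\omega} F_2 F_3 \cos 2\omega t = d_{14}^{2\omega} \cdot 2 F_2 F_3 \cos 2\omega t. \tag{156}$$

It is therefore common to define the “vector” $\mathfrak{F}$ 

$$\mathfrak{F} = \mathfrak{F}_1, \mathfrak{F}_2, \mathfrak{F}_3, \mathfrak{F}_4, \mathfrak{F}_5, \mathfrak{F}_6 = F_1^2,\; F_2^2,\; F_3^2,\; 2F_2F_3,\; 2F_3F_1,\; 2F_1F_2 \tag{157}$$

so that 

$$P_i = \sum_{p=1}^{6} d_{ip} \mathfrak{F}_p. \tag{158}$$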
With this notation d°4 = dij = df ~ djs + d28 but also ds = dj. 
In the electro-optic case, the subscripts referring to optical fields 
can be contracted 


2d? = d?, = de, p=l---6, k=18. (159) 


The alternative ordering (155) leads to df, .. 
Note that in this case, since d operates on two distinct fields, one 
optical the other dc, there is no possibility of constructing a “vector” 
such as $\mathfrak{F}$. The sum implied in the definition of $d_{pk}$ is 
3 
bX = 8X4 = D2, GREY. (160) 
k=1 


Electro-optic data are often presented as coefficients r,, in the sus- 
ceptibility ellipsoid. If n is the refractive index (assumed isotropic), 
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ye: pend _ At pe = dr <= _8r w w 0 161) 
ok = n'! pk 9! ijk n' ifk - (1 


The dimensions of d and r are those of an inverse field. 

In the MKS system, the units are meters per volt, in the cgs system 
they are centimeters per stat-volt. One MKS unit is $3 \times 10^4$ esu and 
so numerical values of d in esu are the larger numbers. 
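The size of the conversion factor follows from the definitions of the units alone. The sketch below assumes only that one statvolt is about 299.79 volts and that one meter is 100 centimeters; the function name `mks_to_esu` is hypothetical.

```python
# Convert a coefficient with dimensions of inverse field from MKS (m/V)
# to cgs-esu (cm/statvolt).
SPEED_OF_LIGHT_CGS = 2.99792458e10              # cm/s
VOLTS_PER_STATVOLT = SPEED_OF_LIGHT_CGS * 1e-8  # about 299.79 V
CM_PER_METER = 100.0


def mks_to_esu(value_m_per_volt):
    """Return the same coefficient expressed in cm per statvolt."""
    return value_m_per_volt * CM_PER_METER * VOLTS_PER_STATVOLT


print(f"1 m/V = {mks_to_esu(1.0):.4g} cm/statvolt")
```

The result is close to 3 × 10^4, which is why a value quoted in esu is numerically the larger of the two.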

We have not discussed the influence of a mixed use of rms and peak 
fields but we note that if rms fields are used throughout the values of 
the coefficients will all be $\sqrt{2}$ times larger than if peak fields are used 
throughout. No one is, however, likely to use an rms dc field. 

Experimental values of the electro-optic coefficients are usually 
expressed in absolute units and the only ambiguity that can occur is 
associated with whether the measurements were made at constant stress 
(unclamped) or constant strain (clamped). It is safe to assume that 
constant stress is implied by the absence of any definite statement to 
the contrary. 

Second harmonic coefficients are sometimes given in absolute units 
but more often relative to the coefficient d;., in KH,PO,. An absolute 
measurement of this by Ashkin, Boyd, and Diedzic®” gave 


2 1,72 —9 ; 
so = oe, == OX 10 es; 


but this is now believed to be too large. The most recent measurements, 
Francois’, Bjorkholm,”’ give 


321 = $daor” = 1.88 X 10°’ esu + 12 percent 


for the coefficient in NH₄H₂PO₄. Relative measurements show that it is 
identical in KDP and ADP. We have used a rounded off, compromise 
value 


KDP 22, = 4022°° = 1.5 X 107° esu (162) 


in compiling the tables. It affects all values of $d^{2\omega}$ at optical frequencies 
but not at 10.6 μ. 

It will be apparent that in comparing theory or experiment with 
experiment, considerable care is needed to be sure that like definitions 
are being compared with like. 
A High-Capacity Digital Light Deflector 
Using Wollaston Prisms 


By W. J. TABOR 


(Manuscript received December 27, 1963) 


A high-capacity digital light deflector (DLD) using Wollaston prisms 
as the passive elements is described. It is shown that, for a 1-cm aperture, 
approximately $4(10)^6$ resolvable positions with a crosstalk ratio of 17 dB 
are theoretically possible. A manually-operated model was constructed 
that gave $\tfrac14(10)^6$ resolvable positions with a crosstalk ratio of 20 to 28 dB. 
The output positions of the model showed resolution approximately equal 
to that set by diffraction theory. 

The problems associated with imperfect modulators are discussed and 
the characteristics of three different schemes of operation are calculated. 
Results from experiments with one such scheme, the reflection mode of 
operation, are given. They compare favorably with the calculations. 


I. INTRODUCTION 


A digital light deflector (DLD) is a device that can switch a light 
beam to a number of distinguishable positions and has been previously 
described by a number of authors.+-* Such a device can be made from 
a number of modulators and passive deflectors. The modulator, for 
this application, is one that is capable of switching the sense of 
polarization, and the deflector unit is a passive element which has 
different optical paths corresponding to the two senses of polarization. 
A basic unit of a DLD is shown in Fig. 1. It has been previously 
shown’ that n such units in series can generate $2^n$ distinguishable 
positions. 

Fig. 1— Basic unit in a digital light deflector (a modulator and a passive deflector). 

A modulator can be made from any material which can become 
birefringent with the application of an external signal. A minimum of 
$\pi$ retardation is needed in order to switch the sense of polarization. 
Examples of modulators that have been considered for this applica- 
tion are Kerr cells,® stressed plate shutters,° and crystals such as 
KDP? and KTN7® which exhibit an electro-optic effect. The Kerr 
cells, although very fast, cannot be used at high repetition rates be- 
cause of heating difficulties. Stressed plate shutters, since they depend 
on mechanical strain, are limited to lower frequencies. The most at- 
tractive modulator materials are the electro-optic crystals and are the 
ones that are being seriously considered for the DLD. 

The passive deflectors that have been considered for this application 
are uniformly thick sections of properly oriented uniaxial crystals 
such as calcite,2.? prisms of the same materials) and Wollaston 
prisms. The uniformly thick pieces of calcite are used with con- 
verging light for the maximum number of resolvable positions® *? but 
with such use suffer from aberrations that are caused by the variation 
of angle in the converging beam. A converging beam of light passing 
through a thick piece of calcite oriented for a displaced beam shows 
aberrations which for the most part appear like astigmatism. Prisms, 
when used with plane waves, can deviate the angle of the plane wave 
without distortion, and therefore, a DLD using prisms can give results 
that are limited only by diffraction theory. A DLD using prisms 
also uses much less birefringent material than one based on uniformly 
thick pieces of the same material. A disadvantage of simple prisms is 
that the difference in angle between the two oppositely polarized 
beams is only a small variation superimposed on the much larger 
normal type prism deflection. This difficulty can be minimized if the 
prisms are immersed in an oil whose index of refraction is near that of 
the prism. Wollaston prisms do not have the deflection associated with 
simple prisms—instead, the only deviation is that between the two 
oppositely polarized beams—and are therefore well suited for DLD 
use. 

In this paper, the design of a high-capacity DLD using Wollaston 
prisms will be discussed along with experimental results which will 
show that this system does lead to densities limited primarily by 
diffraction effects. The problem of an imperfect modulator is also 
discussed, and calculations on several systems are given which relate 
the signal to background ratio to the modulation efficiency. 
II. CAPACITY OF A DLD 


Since a Wollaston prism is used to deflect a beam in angle, it is 
convenient to think of the operation of a DLD in terms of angular 
space. Later it will be convenient to place a lens after the DLD 
which will focus the beams of light, each with a different angle with 
respect to the axis of the lens, into corresponding points on the image 
plane. 

The capacity of a DLD is determined by the values of two angles. 
One is the largest angle that is allowed in the system and the other 
is the minimum angular separation between adjacent positions. The 
capacity of the DLD is then just the square of the ratio of the largest 
angle to the minimum separation, since the array of positions is 
two-dimensional. The minimum value is determined by either diffraction 
effects or imperfections in the optical system, and the largest value is 
determined by the 
maximum angular aperture of the system. First, let us consider the 
lower limit set only by diffraction theory. The light emerging from a 
circular aperture illuminated by a uniformly intense plane wave will 
have a spread of angles that is caused by diffraction. The intensity of 
the light as a function of angle is given by the well-known Airy function 
(Fig. 2).° The smallest deflection angle in a DLD must be sufficiently 
large that the deflected beam can be resolved from the 
undeflected one. If we set the criterion that the two beams should be 
separated in angle such that in the far field the first dark rings of 
the two beams just touch, then the minimum angle is given by $2.44\,\lambda/D$ 
where $\lambda$ is the wavelength of light and D is the diameter of the circular 
aperture (Fig. 2). 

It is possible to estimate the crosstalk, i.e., the ratio of light within 
the first dark ring to the light within a circle of equal size as the first 
dark ring but displaced, by examining the Airy function. The light 
within the first dark ring contains 84 percent of the total energy,® and 
a ring displaced by one diameter (corresponding to a separation of the 
two directions of $2.44\,\lambda/D$) falls within an annulus of $1.22\,\lambda/D$ to 
$3.66\,\lambda/D$ which contains 10.6 percent of the total energy.® Since a circle 
can be surrounded by six circles of the same diameter, this displaced 
ring contains somewhat less than $\tfrac16$(10.6 percent) = 1.77 percent. 
The crosstalk ratio is then 84/1.77 = 47 or 16.7 dB. For a separation 
between the beams of twice the above, i.e., $4.88\,\lambda/D$, the crosstalk 
ratio can be estimated to be 210 or 23.2 dB. 
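The arithmetic behind the 16.7-dB estimate can be reproduced in a few lines. This sketch takes the two encircled-energy fractions quoted in the text (84 percent and 10.6 percent) as given inputs rather than computing them from the Airy function; the function name `crosstalk_db` is hypothetical.

```python
import math


def crosstalk_db(central_fraction, annulus_fraction, neighbours=6):
    """Estimate the crosstalk ratio from encircled-energy fractions.

    central_fraction: energy inside the first dark ring.
    annulus_fraction: energy in the annulus that holds the neighbouring circles.
    neighbours: number of equal circles that fit around the central one.
    """
    displaced_fraction = annulus_fraction / neighbours
    ratio = central_fraction / displaced_fraction
    return ratio, 10.0 * math.log10(ratio)


ratio, decibels = crosstalk_db(0.84, 0.106)
print(f"ratio = {ratio:.1f}, crosstalk = {decibels:.1f} dB")
```

Because the six circles do not fill the annulus completely, dividing by six overestimates the light in the displaced circle, so the figure obtained this way is a conservative one.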

If, in addition to diffraction effects, the wavefront is distorted further 
by some aberration, then the focused spot size will increase and thereby 
decrease the capacity of the DLD. The amount of wavefront distortion 
depends somewhat on the type of aberration, but for wavefront distortion 
of $\lambda/4$ or less the increase in the focal spot size is not very significant.”° 
The aberrations in the DLD will result from inhomogeneities in the 
material and from poorly worked surfaces. It should be emphasized 
that the value of $\lambda/4$ is the maximum variation allowed after passing 
through the entire DLD, and therefore, the maximum variation for 
any individual unit is much less. For a DLD with $10^6$ resolvable positions, 
the total number of units would be 20, i.e., $2^{20}$ is approximately 
$10^6$; and therefore the maximum wavefront error in any unit should 
be less than $\lambda/(4\sqrt{20}) \approx \lambda/20$ where by using the square root we have 
assumed random irregularities. The requirement that a modulator 
have an optical distortion of less than $\lambda/20$, not allowing for any 
imperfections in the remainder of the optics, is extremely difficult to meet and will 
probably represent a serious problem for some time to come. The high 
requirements placed on individual components are a direct result of the 
large number of such elements that must be placed in series for the 
complete DLD. 

Fig. 2— Far-field diffraction at a circular aperture (the Airy pattern). 
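The per-unit tolerance follows from adding independent random errors in quadrature. The short sketch below assumes, as the text does, that the errors of the individual units are random and uncorrelated; the function name `per_unit_tolerance` is hypothetical.

```python
import math


def per_unit_tolerance(total_tolerance_waves, n_units):
    """Allowed wavefront error per unit, in wavelengths, when n_units
    independent random errors must add up to total_tolerance_waves."""
    return total_tolerance_waves / math.sqrt(n_units)


tolerance = per_unit_tolerance(0.25, 20)
print(f"per-unit tolerance is about lambda/{1.0 / tolerance:.1f}")
```

The exact quotient is slightly looser than one twentieth of a wavelength, which the text rounds to.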

The maximum angular aperture of a DLD is limited by a number 
of effects: (i) the response of the Wollaston prisms, (ii) the walk-off 
of the beam as it is deflected to larger and larger angles, (iii) the 
angular aperture of the modulators, and (iv) the angular aperture of 
the output lens. These limitations will now be considered in more 
detail. 

The deviation angle of a Wollaston prism is not constant but is a 
function of the incident angle (see Appendix A) and at some angle 
the deviation will vary sufficiently such that the array of angles is no 
longer uniformly spaced. Calculations based on the equations in 
Appendix A indicate that if the Wollaston prism with the smallest 
deviation is placed first in the DLD and the next largest second, etc., 
for a total of 20 stages and a maximum angle of deviation of 8°, the 
array is uniformly spaced to within 10 percent. 

As the beam traverses through the DLD, the deflection angle can 
become larger and larger and unless the apertures of the prisms and 
modulators are very large, the beam will eventually strike the sides of 
the apparatus. It is clear that the Wollaston prism with the largest 
deviation angle should be placed last in the DLD in order to minimize 
the spreading of the beams. With this arrangement approximately 14 
of the beam is intercepted by the apparatus for a 20-stage deflector 
with a maximum deviation angle of 8°. This latter figure is not a 
result applicable in all cases because it depends on the specific lengths 
of the elements in a DLD. 

The relative retardation in a modulator is also a function of the 
incident angle. How rapidly this function varies depends on the type 
of modulator. In KDP and similar crystals the angular aperture is 
very small (less than 1 minute of arc for 30 dB extinction between 
crossed polarizers for a crystal thickness of 0.089 inches!) since these 
crystals are uniaxial, with a large birefringence, and will, therefore, 
give large relative retardation for even small angles away from the 
crystal axis. Techniques are reported that compensate the birefrin- 
gence?!2 in KDP but the degree of success is not made clear. Because 
of the limited angular aperture, crystals in the class with KDP are 
not considered attractive for the DLD. In CuCl and other crystals in 
this class, the angular aperture can be very large because they are 
cubic crystals in the field-free case and are therefore optically iso- 
tropic. Angular apertures of +25° have been reported'* for this 
material. KTN is also a cubic crystal, but the electro-optic effect in 
this crystal is quadratic in contrast to the linear effect in most other 
materials useful as modulators. Therefore, KTN is usually biased by a 
dc voltage in order to reduce the value of the modulation voltage, and 
this bias reduces the angular aperture of the modulator; however, as 
shown in Appendix B, the angular aperture can still be +10° for 
reasonable bias fields. 

The lens at the output of the DLD will focus the beams to points 
on the image plane. If this lens is not perfect, the spots will be larger 
than that calculated by diffraction theory, and the capacity of the 
DLD will be reduced. Since the choice of lens will depend on the 
application of the DLD, it is not possible to state very precisely what 
the angular aperture could be; however, +10° seems reasonable for 
most applications. 

The calculations on the capacity of the DLD have been based on a 
plane wave of uniform intensity resulting in an Airy pattern in the 
far field. By placing a filter in such a beam which attenuates the light 
as a function of the radial distance, it is possible to greatly reduce the 
energy in the rings at the expense of slightly increasing the size of the 
central disk.1* If such a filter could be effectively incorporated in the 
DLD, the crosstalk between resolvable positions could be greatly 
reduced without significantly reducing the overall capacity. 

As an illustrative example, we will calculate the number of resolvable 
positions assuming that the minimum angle is set by diffraction 
theory, which corresponds to $2.44\,\lambda/D$ for 16.7-dB crosstalk and 
to $4.88\,\lambda/D$ for 23.2-dB crosstalk, and that the maximum angle is 
±10°. For a wavelength of 6000 Å and an aperture of 1 cm, this corresponds 
to a two-dimensional array of approximately $4(10)^6$ resolvable 
positions with a crosstalk ratio of 16.7 dB and $(10)^6$ positions 
with a crosstalk of 23.2 dB. These two cases imply 22 basic units in the 
first case and 20 units in the latter. If one can learn to use much 
larger angles, then the number of positions will increase; on 
the other hand, if components are optically imperfect so that 
diffraction-limited performance is not possible, then these numbers will be 
reduced. 
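The numbers in this example can be checked with a short calculation. The sketch below makes two assumptions that are not stated explicitly in the text: the ±10° limit is treated as a full angular range of 20°, and the number of positions is rounded down to the nearest power of two because each basic unit doubles the count. The function name `dld_capacity` is hypothetical.

```python
import math


def dld_capacity(wavelength_cm, aperture_cm, half_angle_deg, spacing_factor):
    """Return (positions, stages) for a diffraction-limited DLD.

    spacing_factor is the minimum angular separation in units of
    wavelength/aperture (2.44 or 4.88 in the text).
    """
    min_angle = spacing_factor * wavelength_cm / aperture_cm   # radians
    full_angle = 2.0 * math.radians(half_angle_deg)            # radians
    per_axis = full_angle / min_angle
    positions = per_axis ** 2              # two-dimensional array
    stages = math.floor(math.log2(positions))
    return positions, stages


for factor in (2.44, 4.88):
    positions, stages = dld_capacity(6000e-8, 1.0, 10.0, factor)
    print(f"{factor} lambda/D: {positions:.3g} positions, "
          f"{stages} stages, 2**{stages} = {2 ** stages}")
```

Under these assumptions the two spacings give 22 and 20 stages, in line with the unit counts quoted above.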

Thus far the DLD has only been considered in conjunction with a 
diffraction-limited beam; however, images may also be transmitted 
through the deflector. Since it takes a number of diffraction-limited 
points to make up an image, the capacity of a DLD in terms of images 
will obviously be less. 


III. PERFORMANCE OF A MANUALLY-OPERATED DLD 


At the present time it is not possible to construct a large capacity 
DLD using electro-optic modulators since these materials are not 
available in the quantity and quality required. In order to check the 
performance of this system, it is necessary to replace the electro-optic 
modulators with half-wave plates and thereby replace electronic acti- 
vation with mechanical rotation. 

A system as shown in Fig. 3 was constructed consisting of 18 mica 
half-wave plates, 7 pairs of quartz Wollaston prisms, and 2 pairs made 
from calcite. The aperture of the system was 18 mm and the wavelength 
was 6328 Å. The smallest deviation angle in the system was 1 
minute of arc and the largest was 4°; the smallest deflection angle 
corresponds to $8.38\,\lambda/D$, which is a separation somewhat larger than that 
considered earlier in this paper. The aperture of the pinhole was 0.001 
inch, which is larger, by a factor of approximately 4, than that 
required to give a diffraction-limited divergence to the wave emerging 
from lens 1. This system when used with an aperture of this size 
should be considered to be deflecting an image rather than operating 
with a diffraction-limited beam. The purpose of using a spot of this 
size is that the ratio of light in the central disk to that in the 
diffraction rings is much higher than when a diffraction-limited beam is 
used, hence the crosstalk between adjacent positions should decrease 
when compared to the diffraction-limited case. 

Fig. 4 is a picture of a focal plane taken with this apparatus. It 
shows the $2^{18} \approx \tfrac14(10)^6$ resolvable positions. This picture was taken 
by setting each half-wave plate in the halfway position so that light 
was divided equally into both polarizations. In this way all $2^{18}$ 
positions are simultaneously illuminated. Fig. 5 is an enlargement of an 
arbitrarily-selected subsection of Fig. 4 and shows the resolution 
much more clearly. 

Fig. 3 — Arrangement of elements for a high capacity DLD. 

Fig. 4— $2^{18} \approx \tfrac14(10)^6$ resolvable positions of the experimental apparatus. 

Fig. 6 is an enlargement of a single position taken under two cases: 
(i) the pattern on the top illustrates the focused beam with the DLD 
removed from the system, and (ii) the pattern on the bottom is the 
same focused beam with the DLD in the system. The degradation of 
the pattern on the bottom is a result of the optical imperfections in 
the many elements that make up the DLD. A comparison of the two 
patterns shows that the DLD did not increase the size of the central 
disk by an appreciable factor but did make it much more irregular. 
Fig. 6 was overexposed in order to show the weaker diffraction rings 
much more clearly. 

The crosstalk ratio between adjacent positions was measured by 
first setting the DLD so that only one position was present at the 
focal plane. A 0.001-inch aperture was placed at the focal point, ad- 
justed in position for maximum light transmission, and the amount 
of light was measured by a detector. The aperture was then moved 
to an adjacent position, and the amount of light passing through 
the opening was again measured. The ratio of these two numbers 
is the crosstalk. The measurements were made for various settings 
of the DLD and the values ranged from 20 to 28 dB. The range in the 
measurements is presumably due to imperfections in the optical 
system which cause the focused spot to be irregular and unsymmetric 
in shape. The irregular shape is also evident from Fig. 6. 





Fig. 5 — An enlargement of a section of Fig. 4. 
Fig. 6— The degradation of the focused beam pattern by the DLD. (The 
pattern on the top was taken without the DLD in the system and that on the 
bottom with the DLD in the system.) 


The performance of this system is probably worse than the $\lambda/4$ 
tolerance discussed previously in the paper but is probably not much 
worse than a wave or so; this latter figure was not measured directly 
but was estimated from diagrams which show spot patterns as a 
function of various aberrations.*°° 


IV. DLD PERFORMANCE RESULTING FROM IMPERFECT MODULATORS 


It is anticipated that an electronically activated modulator will be 
the weak link in DLD performance for some time to come; it therefore 
is important to know how an inefficient modulator will affect the 
performance of a DLD. In this study, we assume that the Wollaston 
prisms in the DLD are perfect and that the modulator can be characterized 
by a single term, E, the efficiency, which is defined as 

$$E = \frac{a}{b} = \frac{\text{light intensity in the desired polarization}}{\text{light intensity in the undesired polarization}} \tag{1}$$

with 

$$a + b = 1.$$

A perfect modulator by this definition has an infinite efficiency. 
With perfect modulators the image plane would have one bright 
spot at the desired position and the remaining $2^n - 1$ positions would 
be completely dark. With imperfect modulators, and for simplicity 
we assume that they are all imperfect to the same degree, some light 
will fall on every position. The resulting intensity distribution on the 
focal plane has been studied by otherst®?* and is also given in 
Appendix C. It is shown there that the intensities in the focal plane 
of an n-unit DLD can be generated by the expansion of $a^n[1 + (1/E)]^n$, 
where the first term $a^n$ gives the intensity of the desired 
position, the second term $n(a^n/E)$ implies n positions with intensity 
$a^n/E$, the third term $[n(n-1)/2](a^n/E^2)$ implies $n(n-1)/2$ positions 
of intensity $a^n/E^2$, etc. The sum of all the coefficients in the expansion 
of $(1 + 1/E)^n$ is equal to $2^n$ so that each position in the focal plane can 
be assigned to one of these terms. 
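The binomial bookkeeping can be checked numerically. The sketch below assumes only the definitions E = a/b and a + b = 1, so that a = E/(1 + E); the function name `intensity_levels` is hypothetical.

```python
from math import comb


def intensity_levels(n, efficiency):
    """Return a list of (number_of_positions, intensity) pairs for an
    n-unit DLD whose modulators all have the same efficiency E = a/b."""
    a = efficiency / (1.0 + efficiency)     # follows from a + b = 1
    return [(comb(n, k), a ** n / efficiency ** k) for k in range(n + 1)]


levels = intensity_levels(20, 10.0)
total_positions = sum(count for count, _ in levels)
total_light = sum(count * intensity for count, intensity in levels)
print(total_positions, 2 ** 20)
print(f"total light = {total_light:.6f}")
print(f"desired spot carries {levels[0][1]:.3f} of the light")
```

The position counts add up to 2^n and the intensities add up to unity, as they must if no light is lost. The last line shows how small a share of the light the desired spot keeps in a 20-unit DLD when E is only 10.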

It can also be established that the terms in even powers of E, 
i.e., $a^n$, $a^n/E^2$, $a^n/E^4$, etc. have the opposite sense of polarization 
from that of the terms in odd powers of E, i.e., $a^n/E$, $a^n/E^3$, etc. This result 
can be determined from the basic definition of the efficiency (1). 

The requirement on the modulator efficiency is determined by the 
particular application of the DLD. If the deflector will be used to 
accomplish localized heating or welding, to supply energy for a switch, 
or to be used as a printout or display, then the ratio of the intensity 
at the desired location to that at the next highest position is impor- 
tant. This ratio must be sufficiently large so that the intensity at the 
desired location must be great enough to cause the reaction, and yet 
the intensity at the next brightest position must be less than that to 
cause the reaction. For example, it is possible that for processes that 
depend on heating 10 dB is a sufficient ratio, whereas for a visual 
display 30 dB or greater may be necessary so that the eye will not be 
confused by multiple images. The intensity in the desired position, 
as previously stated, is $a^n$ and that in the next highest case is $a^n/E$ 
for the opposite polarization and $a^n/E^2$ for the same polarization. 
Therefore, the next brightest position is weaker than the desired one by 
a factor 1/E when there is no polarization selection and 
$1/E^2$ when polarization selection is used. In general, it should always 
be possible to eliminate the opposite polarization so that the first 
troublesome term will be $a^n/E^2$. 

The DLD can also be used as a memory device.” In this application 
a memory is placed in the focal plane which is constructed such that 
at each of the focused points an opaque or transparent spot is present. 
This code is suitable for a binary organized memory where, for 
example, the opaque position can represent 0 and the transparent 


968 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1967 


position can represent 1. The nature of the position can be read by 
placing a detector behind the memory and then directing a light beam 
to the desired location. The presence of an opaque position is de- 
termined by no response at the detector, and similarly a transparent 
position will result in a positive response. 

When the modulators are imperfect, the undesired beams of light 
will also strike the memory plane and have some chance of reaching 
the detector with the possibility of causing erroneous results. For 
error-free operation the detector must receive more light when the 
DLD is addressed to a transparent position than when it is addressed 
to an opaque position, and for future reference let us call the ratio of 
these two intensities the signal-to-background ratio, R. The least light 
that can reach the detector when the DLD is addressed to a transparent 
position is $a^n$, i.e., the main beam alone, while the most light 
that can reach the detector when the DLD is addressed to an opaque 
position is $I_{tot} - a^n$, i.e., all the light except for the main beam. The 
minimum signal-to-background ratio is then 

$$R_{min} = \frac{a^n}{I_{tot} - a^n}. \tag{2}$$


The Rmin ratio defined by (2) is not an unreasonable minimum in 
that a memory plane can be designed to give ratios very close to the 
values calculated by this equation. 

Three different ways of interrogating the memory will be discussed, 
and for each case calculations will be made for the signal-to-back- 
ground ratios. The first case is where all of the light is allowed to 
strike the memory plane; the second uses a polarization selection 
before the memory so that only light polarized in the same sense as 
the main beam will reach the memory plane; and the third is the 
reflection mode of operation. To prevent confusion the subscripts 1, 2, 
3 will be used on the $R_{min}$ ratios for the three cases mentioned. 


4.1 Case 1—All Light From The DLD Allowed to Reach The Memory 
In this case $I_{tot} = 1$ since a + b = 1. Therefore, 

$$(R_{min})_1 = \frac{a^n}{1 - a^n} = \frac{1}{n\left(\frac{1}{E}\right) + \frac{n(n-1)}{2!}\left(\frac{1}{E}\right)^2 + \frac{n(n-1)(n-2)}{3!}\left(\frac{1}{E}\right)^3 + \cdots}. \tag{3}$$

The signal-to-background ratio for this case is plotted as a function of 
modulator efficiency, E, for several values of n in Fig. 7. It shows that 
for an n = 20 DLD with an $(R_{min})_1$ ratio of 5 dB, a modulator 
efficiency of 18.6 dB is required. 


4.2 Case 2—Polarization Selection Before The Memory 


If polarization selection is used after the DLD, which would require 
an additional modulator and polarizer, the odd powers of E can be 
cancelled from (2) and the $(R_{min})_2$ ratio becomes 

$$(R_{min})_2 = \frac{1}{\frac{n(n-1)}{2!}\left(\frac{1}{E}\right)^2 + \frac{n(n-1)(n-2)(n-3)}{4!}\left(\frac{1}{E}\right)^4 + \cdots}. \tag{4}$$

This ratio, $(R_{min})_2$, is plotted as a function of efficiency in Fig. 8. It 
shows that for n = 20 and $(R_{min})_2$ = 5 dB, a modulator with an 
efficiency of 14.0 dB is required. This case represents an improvement 
of 4.6 dB over the first case. 
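The efficiencies quoted for the first two cases can be recovered numerically. The sketch below rests on an assumption about the series: since $a^n (1 + 1/E)^n = 1$, the denominator for case 1 is taken as $(1 + 1/E)^n - 1$, and for case 2 only the even powers of 1/E are kept. The function names are hypothetical, and the search is an ordinary bisection on the efficiency in decibels.

```python
def r_min_all_light(efficiency, n):
    """Case 1: all light reaches the memory; 1 / ((1 + 1/E)^n - 1)."""
    return 1.0 / ((1.0 + 1.0 / efficiency) ** n - 1.0)


def r_min_polarization_selected(efficiency, n):
    """Case 2: only the even powers of 1/E contribute to the background."""
    x = 1.0 / efficiency
    even_terms = 0.5 * ((1.0 + x) ** n + (1.0 - x) ** n) - 1.0
    return 1.0 / even_terms


def required_efficiency_db(r_min, n, target_db, lo_db=1.0, hi_db=40.0):
    """Bisect for the efficiency (in dB) at which r_min reaches target_db."""
    target = 10.0 ** (target_db / 10.0)
    for _ in range(100):
        mid_db = 0.5 * (lo_db + hi_db)
        if r_min(10.0 ** (mid_db / 10.0), n) < target:
            lo_db = mid_db      # ratio too small: need a better modulator
        else:
            hi_db = mid_db
    return 0.5 * (lo_db + hi_db)


case1_db = required_efficiency_db(r_min_all_light, 20, 5.0)
case2_db = required_efficiency_db(r_min_polarization_selected, 20, 5.0)
print(f"case 1: {case1_db:.2f} dB, case 2: {case2_db:.2f} dB")
```

For n = 20 and a 5-dB ratio the two results should land close to the 18.6 dB and 14.0 dB quoted in the text, which gives some confidence in the closed forms assumed here.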


Fig. 7— Minimum detector ratio versus modulator efficiency for a DLD where 
the detector is placed behind the focal plane. (Ordinate: minimum 
signal-to-background ratio, $(R_{min})_1$, in decibels; abscissa: modulator 
efficiency, E, in decibels.) 
4.3 Case 3—Reflection Mode of Operation 

An alternate way" of reading the memory plane is shown in Fig. 9. 
In this case, the light is reflected from a mirror located just behind 
the memory and is redirected through the DLD to be detected 
after passing through a second aperture. The second aperture eliminates 
a large part of the background and therefore the ratio, $R_{min}$, for 
the same modulator efficiency is considerably improved. The derivation 
of the $R_{min}$ ratio for this reflecting mode of operation, $(R_{min})_3$, is 
given in Appendix D and only the result is shown here: 

$$(R_{min})_3 = \frac{1}{n\left(\frac{1}{E^2}\right) + \frac{n(n-1)}{2!}\left(\frac{1}{E^2}\right)^2 + \cdots}. \tag{5}$$

$(R_{min})_3$ is plotted as a function of E in Fig. 10. With this mode of 
operation a modulator with an efficiency of only 9.1 dB is required to 
give an $(R_{min})_3$ ratio of 5 dB or better. The reflection mode of operation 
therefore represents an improvement, when measured by reduced 
requirements on the modulator, of 9.5 dB over the first case and 4.9 dB 
over the second case. It should be emphasized that the improvement 
ratios given here are strongly dependent on the choice of the $R_{min}$ 
ratio, which was taken to be 5 dB in this paper. For larger $R_{min}$ 
ratios the improvement in the E ratio would be even greater and vice 
versa. 

Fig. 8— Minimum detector ratio vs modulator efficiency for a DLD with 
the detector placed behind the focal plane and with polarization selection. 

Fig. 9— Arrangement of elements for the DLD using the reflection mode of 
operation. 

A disadvantage of the reflection mode is that a low f-number lens 
must be used at the output of the DLD. The reason for this is that 
the light reflected from the plane mirror must enter the output end of 
the DLD, and this requires that the focal plane be approximately one half 
the linear dimension of the aperture of the DLD. One can show that 
this requires a lens of f:1.5 or so if the angular spread of the DLD is 
+10°. If a lens is designed to have a spherical focal surface and a 
spherical mirror is used as the reflector, then the lens can have any f 
number. 

Any light reflected from the surfaces in the DLD when used in the 
reflection mode can be prevented from entering the aperture near 
the detector by giving a slight tilt to the elements that make up the 
DLD. It is possible to choose an angle such that no reflection is 
centered on the aperture. 


V. EXPERIMENTS WITH THE MANUALLY-OPERATED DLD EQUIPPED WITH 
POOR MODULATORS 


The experiments that will be described in this section make use of
the n = 18 manually-operated DLD where the half-wave plates have
been substituted for the electro-optic modulators. This is the same
apparatus as used in Section III.

Fig. 10 — Minimum detector ratio versus modulator efficiency using the
reflection mode of operation.


Ideal half-wave plates have the property that if the angle between
the polarization direction and the axis of the half-wave plate is θ,
then the plane of polarization emerging from the plate will be rotated
by 2θ from its original direction. For maximum efficiency, the half-wave
plates are oriented at θ = 0 if no switching action is desired and
at θ = 45° if the other polarization is desired. The practical maximum
efficiency for the split mica plates used in this experiment ranged
from 30 to 40 dB, which is high enough to give almost perfect DLD
behavior. In order to simulate poor modulators, the wave plates are
set at θ = ε for the predominantly unswitched case and at θ = 45° − ε
for the predominantly switched case. The angle ε can then be adjusted
to achieve any degree of modulator efficiency.
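
The relation between the misorientation angle and the simulated efficiency can be sketched in Python. This is only an illustration under stated assumptions: an ideal half-wave plate followed by an ideal polarization splitter, so that a misorientation ε rotates the polarization 2ε away from the wanted direction and Malus's law gives a = cos^2(2ε) and b = sin^2(2ε). The function names are hypothetical.

```python
import math


def efficiency_db(eps_deg):
    """Efficiency E = a/b in dB for a plate misoriented by eps_deg (> 0) degrees.

    Assumes a = cos^2(2*eps) in the wanted polarization and b = sin^2(2*eps)
    in the unwanted one (ideal plate, ideal splitter).
    """
    two_eps = math.radians(2.0 * eps_deg)
    return 10.0 * math.log10((math.cos(two_eps) / math.sin(two_eps)) ** 2)


def misorientation_deg(e_db):
    """Inverse of efficiency_db: the angle eps in degrees that yields e_db."""
    e = 10.0 ** (e_db / 10.0)
    return math.degrees(0.5 * math.atan(1.0 / math.sqrt(e)))
```

With this model an efficiency of 10 dB corresponds to a misorientation of roughly 8.8°, and the 30 to 40 dB range of the mica plates corresponds to angles below about 1°.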

To illustrate the behavior of a DLD under the influence of poor
modulators, the half-wave plates were set for E = 10 dB, and a
picture, which is shown in Fig. 11, was taken at the focal plane. This
picture shows some characteristics which will now be enumerated:

(i) The desired spot is shown as the brightest point in the upper
left-hand quadrant and is vertically polarized.

(ii) Some of the 18 points whose intensity is 1/E of the main
beam are shown in the lower right-hand quadrant and are horizontally
polarized.

(iii) Some of the 153 points whose intensity is 1/E^2 of the main
beam are shown in the upper left-hand quadrant and are vertically
polarized.

(iv) Some of the 816 points whose intensity is 1/E^3 of the main
beam are shown in the lower right-hand quadrant and are horizontally
polarized.

(v) The points in the upper left-hand quadrant are vertically
polarized and those in the lower right-hand quadrant are horizontally
polarized. A polarization selector in this case would eliminate the
entire lower side. It will always eliminate the side that is opposite to
the one that contains the main beam.

(vi) There are no points with significant intensity (> 1/E^(n/2)) in
either the upper-right or lower-left quadrant.

Fig. 11 — Focal plane intensity distribution with modulators set at E = 10 dB.


Not all of the focal plane is shown in this figure; some of the
background positions are missing so that an enlargement could
be presented. In this exposure there is a total of approximately 4.6
times more light energy in the background than in the main beam.
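
The point counts and the background total quoted above follow from the binomial distribution of Appendix C. The short Python sketch below reproduces them for n = 18 and E = 10 dB; the variable names are illustrative, and the spot levels are expressed relative to the main beam.

```python
from math import comb

n = 18     # stages in the manually operated DLD
E = 10.0   # a modulator efficiency of 10 dB as a power ratio

# Number of background spots that took exactly r unwanted paths, r = 1, 2, 3
spot_counts = [comb(n, r) for r in (1, 2, 3)]

# Intensity of each such spot relative to the main beam: 1/E, 1/E^2, 1/E^3
spot_levels = [E ** -r for r in (1, 2, 3)]

# Total background light relative to the main beam: (1 + 1/E)^n - 1
background_to_main = (1.0 + 1.0 / E) ** n - 1.0
```

The counts come out as 18, 153 and 816, and the background total is about 4.56 times the main beam, which agrees with the figure of approximately 4.6.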

The performance of the reflection mode of operation was compared
to the theoretical calculation by making use of the properly misoriented
half-wave plates. For two reasons (5) cannot be used directly:
(i) in this experiment a memory was used that contained only
one opaque position, and for such a case the signal-to-background ratio
given by this equation is not accurate; and (ii) equation (5) assumes
that the opaque positions are also perfectly absorbing, which is not
valid for this experiment in that the opaque position reflected 4
percent of the incident power. A signal-to-background ratio for this
particular experiment can be calculated as follows. When the DLD is
addressed to a transparent position the light reaching the mirror is
I_0, and the light striking the mirror when the DLD is addressed to
the opaque position is I_0 − (1 − Γ)a^n, where Γ is the power reflection
coefficient of the opaque position. The ratio of these two values is

$$(R_{exp})_3 = \frac{I_0}{I_0 - (1-\Gamma)a^n} = \frac{I_0 - (1-\Gamma)a^n}{I_0 - (1-\Gamma)a^n} + \frac{(1-\Gamma)a^n}{I_0 - (1-\Gamma)a^n} = 1 + \frac{1-\Gamma}{\dfrac{1}{(R_{\min})_3} + \Gamma}. \tag{6}$$


The calculated and experimental values of (R_exp)_3 are summarized
in Table I. The first column lists the modulator efficiency, the second
column contains the calculated (R_exp)_3, and the third column lists the
measured values.

The calculated and measured quantities agree very well, which
indicates that the behavior of the reflection mode of operation is
adequately understood.

TABLE I — Signal-to-Background Ratio for the Reflection Mode Experiment

Modulator efficiency     (R_exp)_3 calculated     (R_exp)_3 measured
       (dB)                     (dB)                     (dB)

        30                      14.0                      14
        20                      13.8                      14
        10                       7.0                       8
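
The calculated column of Table I can be reproduced with a few lines of Python. The sketch assumes that the signal-to-background ratio takes the form (R_exp)_3 = 1 + (1 − Γ)/[1/(R_min)_3 + Γ], with (R_min)_3 = 1/[(1 + 1/E^2)^n − 1] from Appendix D, Γ = 0.04 and n = 18. This closed form is a reading of the derivation above rather than a quoted formula, and the function name is hypothetical.

```python
import math


def rexp3_db(e_db, gamma=0.04, n=18):
    """Signal-to-background ratio (R_exp)_3 in dB for an efficiency in dB.

    gamma is the power reflection coefficient of the single opaque position.
    """
    e = 10.0 ** (e_db / 10.0)                    # efficiency as a power ratio
    inv_rmin3 = (1.0 + 1.0 / e**2) ** n - 1.0    # 1/(R_min)_3
    ratio = 1.0 + (1.0 - gamma) / (inv_rmin3 + gamma)
    return 10.0 * math.log10(ratio)


calculated = {e_db: round(rexp3_db(e_db), 1) for e_db in (30, 20, 10)}
```

Under these assumptions the three efficiencies give 14.0, 13.8 and 7.0 dB, matching the calculated column. The saturation near 14 dB at high efficiency reflects the 4 percent reflection of the opaque position.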


VI. PARALLELING THE OUTPUT


In the DLD thus far discussed, only one memory in the output 
focal plane is used (see Fig. 3), and therefore the memory is read one 
bit at a time. For some applications it may be advantageous to 
parallel the output as shown in Fig. 12 in order to increase the bit 
capacity of the DLD. With this scheme the number of bits read with 
each setting of the DLD is equal to the number of memory planes; 
the memory now corresponds to one with word organization. Fig. 12 
shows four such memory planes but by going to a three-dimensional 
array it is possible to parallel 30 to 40 such planes and still use only 
one output lens providing the maximum angular aperture of the DLD 
is limited to a total angle of 12° or so. If additional lenses are em- 
ployed, then any number of memory planes can be incorporated. 

This scheme of paralleling is directly applicable to cases 1 and 2, 
which were discussed in Section IV, but will not work for the reflection 
mode since there is no way to distinguish between different memory 
planes when the light is redirected through the DLD. In order to 
parallel the output of the reflection mode, three different schemes have
been devised: (i) to use different wavelengths for each memory plane
and separate the colors before and after the DLD,” (ii) to modulate
a monochromatic beam at each memory plane with a different frequency
and then to separate the different frequencies after reflection
through the DLD,?° and (iii) to arrange the memory planes to have
different distances between the output of the DLD and the memory
plane and to use a short pulse of light; the different planes can now
be read since each plane will return the pulse to the detector at a
different time.**

One difficulty that can arise in the paralleling schemes is that very
low f-number lenses must be used to refocus the output plane into
repeated images (Fig. 13). The first lens placed after the DLD can
have a reasonable f number since the light beams are still confined to
the aperture of the DLD. The second lens must have f:1 or so if the
output of the DLD has a total angle of approximately 12°. Additional
lenses must have even lower f numbers (Fig. 13). The practical solution
to this problem is to perform all of the paralleling within the
first focal length. This scheme will not work with the reflection mode
where the time of flight varies for each memory plane since one lens
implies only one distance between it and the various memory planes.

Fig. 12 — One method of paralleling the output of the DLD.


VII. MEMORY MEDIA 


The problem of reading a memory has been discussed in earlier 
sections of this paper; we will now consider the problem, which is 
again primarily the result of imperfect modulators, of using the DLD 
to write into a memory. Two general types of materials will be 
considered for use as a memory medium. One is a medium where the 
process is linear in terms of total exposure, i.e., the effect on the 
medium of n pulses of light of intensity I/n, each lasting for a time
ΔT, is the same for any value of n; and the second is one which has a
threshold in terms of the light intensity. An example of the first type 
is a photographic film and of the second is a memory based on a 
transparent ferrimagnetic garnet at its compensation temperature.”* 




A linear medium has the disadvantage, for this application, in that 
it integrates the light striking its surface. Therefore, when the photo- 
graphic film is being exposed by a DLD with poor modulators, the 
problem of the background light must be considered. This problem 
is different from that considered in Section IV because in that case 
the DLD was set for one address and the question was asked what is 
the light intensity distribution over the whole focal plane. In this 
case, we ask what is the total amount of light energy striking one 
position on the memory plane when the DLD is addressed to all of 
the positions. When the DLD is set for one address, the total exposure
over the whole plane is a^n[1 + (1/E)]^n ΔT, where ΔT is the duration
of the exposure. This result is evident from the discussion in Section
IV. When one sits at a position and the DLD is addressed to all
positions and dwells at each position for the same time ΔT, the total
exposure at that position is also a^n[1 + (1/E)]^n ΔT. This latter result
has been previously published?’ and is also proven in Appendix C.

The following calculations, which represent worst cases for writing, 
can be performed. The first case that will be considered is the situation 
where all of the light is allowed to strike the memory plane and the 
second case makes use of polarization selection. In both cases it will be 
assumed that all points will be addressed, except for one.

Fig. 13 — Lens requirements for re-imaging the output focal plane of the DLD.

7.1 Case 1—All Light from the DLD Allowed to Reach the Memory


In this case the light striking any of the addressed positions will
have an exposure of nearly a^n(1 + 1/E)^n ΔT and the exposure at the
one position that was not addressed will be [a^n(1 + 1/E)^n − a^n]ΔT,
and the ratio of these two values is

$$M_1 = \frac{a^n\left(1+\frac{1}{E}\right)^n}{a^n\left(1+\frac{1}{E}\right)^n - a^n} = \frac{a^n\left(1+\frac{1}{E}\right)^n - a^n}{a^n\left(1+\frac{1}{E}\right)^n - a^n} + \frac{a^n}{a^n\left(1+\frac{1}{E}\right)^n - a^n} = 1 + (R_{\min})_1. \tag{7}$$
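
The two forms of the Case 1 exposure ratio can be compared numerically. The Python sketch below computes M_1 directly from the two exposures and again as 1 plus the last term of the derivation, a^n/[a^n(1 + 1/E)^n − a^n] = 1/[(1 + 1/E)^n − 1]. The function names are hypothetical, and a is obtained from E = a/b with a + b = 1 as in Appendix C.

```python
def exposure_ratio_m1(e, n):
    """M_1: exposure at an addressed position over that at the unaddressed one."""
    a = e / (1.0 + e)                        # from E = a/b and a + b = 1
    addressed = a**n * (1.0 + 1.0 / e) ** n  # exposure per unit dwell time
    unaddressed = addressed - a**n           # same, minus the main beam
    return addressed / unaddressed


def rmin1(e, n):
    """Last term of the M_1 derivation: a^n / [a^n (1 + 1/E)^n - a^n]."""
    return 1.0 / ((1.0 + 1.0 / e) ** n - 1.0)
```

For example, with E = 100 (20 dB) and n = 18 both routes give an exposure ratio of about 6.1, so even a 20 dB modulator leaves the unaddressed position with roughly one sixth of the full exposure.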


7.2 Case 2—Polarization Selection Before the Memory 


In the same way as in the first case one can derive

$$M_2 = 1 + (R_{\min})_2. \tag{8}$$

The exposure ratios M_1 and M_2 are plotted as a function of modulator
efficiency in Figs. 14 and 15. These plots can be used to determine
the minimum modulator efficiency required for a certain exposure
ratio.

From the exposure ratio and the properties of the medium, e.g., 
photographic film, the density ratios of the positions can be calculated. 
These two ratios do not have to be the same, as a material such as 
photographic film can be very linear in terms of exposure but the ex- 
posure vs film density can be very nonlinear. 

For a threshold medium the problem is much simpler. The only requirement
is that the most intense beam must be greater than threshold
and that the next highest position be less than threshold. As mentioned
in Section IV, this ratio is 1/E when no polarization selection
is used and 1/E^2 when polarization selection is used.


VIII. ANALOG CONTRASTED WITH DIGITAL DEFLECTION 


The same large number of resolvable positions as described in this
paper could, in principle, be achieved by means of an analog deflection;
an example of this is a prism of an electro-optic material with
electrodes placed on the parallel surfaces. It is necessary to induce an
increase of 2π retardation along the base of such a prism in order to
deflect the beam by one resolvable position. Therefore, for a 10^6-position
deflector it is necessary to have 2(10)^3 π along the X bank and
2(10)^3 π along the Y bank for a total of 4(10)^3 π total retardation. With
a DLD, a total retardation of 20π can accomplish the same number
of resolvable positions. Therefore, it is evident that the DLD makes
very efficient use of the variable retardation. The reason for this efficiency
is that the DLD makes use of the fixed retardation in the
passive elements whereas the analog deflector must generate all of the
retardation. In addition, the DLD can be designed for any separation
between the beams and still not require any more than the 20π variable
retardation. The analog deflector, on the other hand, cannot
separate the beams any further without supplying additional retardation.

Fig. 14 — Exposure ratio vs modulator efficiency for a DLD where all of the
light is allowed to reach the memory plane.
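
The retardation budget of the two approaches can be tallied in a few lines of Python. The sketch assumes 2π of variable retardation per resolvable position per axis for the analog prism and one half-wave (π) modulator per binary stage for the DLD; the function names are illustrative.

```python
import math


def analog_retardation_pi(positions_per_axis, axes=2):
    """Total variable retardation in units of pi, at 2*pi per position per axis."""
    return 2 * positions_per_axis * axes


def dld_retardation_pi(total_positions):
    """Total variable retardation in units of pi: one pi per binary stage."""
    return math.ceil(math.log2(total_positions))


analog_total = analog_retardation_pi(1000)   # 10^3 by 10^3 positions
dld_total = dld_retardation_pi(10**6)        # same number of positions
```

For 10^6 positions this gives 4000π for the analog deflector against 20π for the DLD, a factor of 200 in variable retardation.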


IX. CONCLUSION 


The construction and characteristics of a high-capacity DLD have
been described, and it has been demonstrated that the number of
resolvable positions that can be attained is reasonably close to that
allowed by diffraction theory. The effect of imperfect modulators on
the performance of the DLD has also been discussed.

The discussion presented here does not mention the problems associated
with the high-speed switching of an electro-optic modulator,
which is a problem that must be solved if the DLD is to have broad
application. This problem has been studied by S. K. Kurtz.?*

Fig. 15 — Exposure ratio vs modulator efficiency for a DLD where polarization
selection is used.
APPENDIX A


Optical Properties of a Wollaston Prism 


The formulas necessary to trace the two wave normals through a 
Wollaston prism in a direction as shown in Fig. 16 are given below: 


sin B = oe 
i y 


n 
tana = —tan 8 
Ne 


| (9) 
sin(@+ 6 = Sn ain (0 + a) 
sina = n, sin e€ 
sin 8 = —sny 
sin (6 + 6) = ~ sin (@ + 8) (10) 


sin b =n, sin 6, 


where the symbols are defined in Fig. 16. 

A useful approximate formula for calculating the total deviation angle
of a Wollaston prism, Δ (Δ = a + b), for perpendicular incidence,
ψ = 0, is

$$\Delta = \left(|a| + |b|\right)_{\psi=0} = 2\,|n_e - n_o|\tan\theta + \cdots. \tag{11}$$
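
The leading term of (11) is easy to evaluate. The Python sketch below does so for illustrative values only: approximate calcite indices in the visible (n_o ≈ 1.658, n_e ≈ 1.486) and a prism angle θ of 10°. Both the material and the reading of θ as the prism wedge angle of Fig. 16 are assumptions of this example.

```python
import math


def wollaston_deviation_deg(n_o, n_e, theta_deg):
    """Leading term of (11): Delta = 2*|n_e - n_o|*tan(theta), in degrees."""
    delta_rad = 2.0 * abs(n_e - n_o) * math.tan(math.radians(theta_deg))
    return math.degrees(delta_rad)


# Illustrative values only (approximate calcite indices, 10 degree prism angle)
example_deg = wollaston_deviation_deg(n_o=1.658, n_e=1.486, theta_deg=10.0)
```

With these numbers the two beams separate by about 3.5°. Because the deviation scales with tan θ, the prism angle of each successive stage sets the spacing of the resolvable positions.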


The variation of Δ with respect to a variation in ψ at perpendicular
incidence, (dΔ/dψ)_{ψ=0}, can be calculated from (9) and (10) to be


(24) = cos ners Oe \ (12) 
d¥/ 5 =o cos b cos (6 + 6) cos a cos (6 + 6} ’ 
Fig. 16 — Diagram for wave normal paths in a Wollaston prism.


which to a good approximation can be reduced to 


(4) = (n, — ny(4 oe : ) tan 0. (13) 


oY 5 Ms 
APPENDIX B 


Angular Aperture of a Modulator 


This calculation is valid for materials that are cubic, and therefore 
optically isotropic, in the absence of an electric field and become 
uniaxial, with the optic axis parallel to the electric field, in the pres- 
ence of an electric field. 

Fig. 17 describes the placement of the crystal with respect to the
incident radiation. The xy plane is the first surface of the modulator,
and the second surface is parallel to the first and passes through the
point z = −T. The induced C axis of the crystal is parallel to the
y axis. The light ray makes an angle ψ with the z axis, and the intersection
of the plane of incidence with the xy plane makes an angle α
with the x axis. The relative retardation between the extraordinary
and ordinary ray can be calculated to be
where n_o and n_e are the ordinary and extraordinary indices of refraction
and λ_0 is the free-space wavelength of the light. Equation (14)
reduces to the familiar (n_e − n_o)(2πT/λ_0) for perpendicular incidence.

Since n_o and n_e are nearly the same in this case, we will expand
(14) in terms of powers of (n_e − n_o) and drop terms containing
(n_e − n_o)^2 and higher. We will also expand sin ψ using a power series in ψ.
The result of these substitutions is

$$R = (n_e - n_o)\,\frac{2\pi T}{\lambda_0}\left\{1 - \frac{1}{n_o^2}\left(\cos^2\alpha - \tfrac{1}{2}\right)\psi^2\right\}. \tag{15}$$
Fig. 17 — Coordinate axes showing the placement of the uniaxial crystal with
respect to the incident radiation.

Equations (14) and (15) can be used to describe the interference
pattern obtained when a uniaxial crystal whose C axis is parallel to
the crystal surface is placed between crossed polarizers. This pattern
can be found in any standard text on optics.**

In order that a modulator switch the sense of polarization, it is necessary
that the retardation be changed by π. In general, the retardation
will be changed from Nπ to (N + 1)π with the application of an
electric field. The retardation as a function of incident angle for the
largest retardation, (N + 1)π, using terms only up to ψ^2, becomes

$$R = (N+1)\pi\left\{1 - \frac{1}{n_o^2}\left(\cos^2\alpha - \tfrac{1}{2}\right)\psi^2\right\}. \tag{16}$$


The angular aperture of a modulator is determined by the value of
ψ where the change in retardation from π becomes serious enough to
cause unwanted behavior. If we call this change in retardation ΔR,
then the value of ψ that corresponds to this ΔR can be calculated from
(16), i.e.,

$$\psi = \left[\frac{n_o^2\,\Delta R}{(N+1)\pi\left(\cos^2\alpha - \tfrac{1}{2}\right)}\right]^{1/2}. \tag{17}$$

From (17) one can see that the angular aperture for a given material
is inversely proportional to √(N + 1), so that we can write

$$\psi_{\text{biased}} = \frac{\psi_{\text{unbiased}}}{\sqrt{N+1}}, \tag{18}$$

i.e., to the same degree of performance the angular aperture of a modulator
biased to Nπ is decreased by the factor 1/√(N + 1) of the
unbiased case.

For a material with a linear electro-optic effect, it is most sensible
to use N = 0 in order to use the lowest voltages; this will result in the
maximum angular aperture. For a quadratic material like KTN, it is
sometimes more efficient to use a biasing dc voltage in order to reduce
the modulation voltage. In that case, an N of 10 or 20 might be used.
If we decide that the maximum retardation error is ΔR = 0.0206π,
corresponding to a minimum extinction of 30 dB between polarizers,
then the angular aperture of KTN is ±26° in the unbiased case, ±7.8°
for N = 10, and ±5.7° for N = 20.
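
The biased apertures quoted above follow from the unbiased value through the 1/√(N + 1) scaling of (18). A minimal Python check, with a hypothetical function name:

```python
import math


def biased_aperture_deg(unbiased_deg, n_bias):
    """Angular aperture of a modulator biased to N*pi, per the scaling of (18)."""
    return unbiased_deg / math.sqrt(n_bias + 1)


ktn_apertures = {n: round(biased_aperture_deg(26.0, n), 1) for n in (0, 10, 20)}
```

Starting from ±26° for the unbiased case, this gives ±7.8° for N = 10 and ±5.7° for N = 20, as stated in the text.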

The angular aperture of biased KTN can be increased by placing
a properly oriented positive uniaxial crystal such as quartz in series
with the biased KTN modulator. This technique can be used to eliminate
the terms in ψ^2 from the total retardation and thereby increase
the angular aperture to approximately that of the unbiased case.
APPENDIX C


Intensity Distribution in a DLD 


Assume that we have a DLD consisting of n stages made up of
modulators with an efficiency E = a/b, a + b = 1. We again assume
that only the modulators are imperfect, and that every modulator can
be characterized by the same efficiency.

Any light beam incident to a modulator is broken up into two
beams, an "a" beam for the desired polarization and a "b" beam for
the oppositely polarized position. A table can be made up which lists
the total number of paths through the DLD (Table II). Since we have
a choice at each modulator as to whether an a or b path is taken, the
different paths are characterized by all possible combinations of the
a and b terms. Let us consider the paths containing r of the b
terms and (n − r) of the a terms. The total number of possible
ways of grouping these terms is given by

$$\frac{n!}{(n-r)!\,r!} \tag{19}$$

and the associated intensity of each of these beams is a^{n−r}b^r. The
total over all paths is then given by summing r through its range

$$\sum_{r=0}^{n} \frac{n!}{(n-r)!\,r!}\,a^{n-r}b^{r}, \tag{20}$$

where the intensity terms have been included. These terms can also be
generated by

$$(a + b)^n \tag{21}$$

since the terms n!/[(n − r)! r!] are also the coefficients of the binomial
expansion. From the definition of E, (21) can also be written

$$a^n\left(1 + \frac{1}{E}\right)^n. \tag{22}$$

To derive (22), the DLD was set for one address and the intensity
at each point was determined. We now ask what are the different intensities
that arrive at a particular position when the DLD is addressed
to all possible positions.

In order to address the DLD to every position, the state of the
modulators can be arranged according to Table III. In this table, 0
means no change in the state of polarization and 1 refers to a change
to the opposite sense of polarization. Table III is a partial listing. The
complete table is made by first writing the nth column, which consists
of alternating 0's and 1's for a total of 2^n entries; the
(n − 1)th column is written by entering pairs (2^1) of the 0's and 1's
for a total of 2^n; the (n − 2)th column by entering 2^2 of 0's and 1's,
etc. The sequence of addresses in Table III would place the main beam
once at each location on the focal plane. The 0's and 1's that appear
in any horizontal row are the address of that beam.

TABLE IV

Address of main beam and setting of DLD, A_0        0     0     1      0      1

Address of the position at which we wish
to compare intensity, A_1                           0     0     0      1      0

Intensity division at each modulator
between A_0 and A_1                                 1     1   (b/a)  (b/a)  (b/a)

Intensity at A_0                                    a^5
Intensity at A_1                                    a^2 b^3

We must now be able to compare intensities between that of the
main beam, which we shall call A_0, and some arbitrary position, which
we shall call A_1. Table IV illustrates the technique. Table IV was
constructed by using a rule that sets the intensity ratio at 1 for
modulators that have the same setting and b/a at modulators that
have different settings.

We now wish to determine all of the intensities at some arbitrary
position, say A_1 = ...010, while the DLD is addressed to all positions.
Using Table III, which lists all of the addresses, and Table IV, which
illustrates the comparison rule, we can construct a table (Table V)
which lists the intensity ratios at A_1 = ...010.

Table V is similar to Table III in appearance in that for any
vertical column a 0, 1 in Table III is changed into 1, b/a or b/a, 1 to
make Table V.

Table V is one that lists all possible combinations of the entries 1,
b/a and therefore is calculable from the same general formula as that
deduced for Table II, i.e.,

$$\left(1 + \frac{b}{a}\right)^n. \tag{23}$$

It is evident that Table V would not change, except for a different
ordering of the horizontal rows, no matter what the particular address
of A_1. Therefore, (23) is the same for all points A_1.

Equation (23) lists the intensity ratios between A_0 and some
point A_1. If we multiply through by the intensity of A_0 = a^n, then (23)
becomes

$$(a + b)^n, \tag{24}$$

which is the desired result.


TABLE V — List of Intensity Ratios at Point ···010 When
DLD Is Addressed to All Positions

              Modulator number

etc.     (n − 2)     (n − 1)       n

            1          b/a         1
            1          b/a        b/a
            1           1          1
            1           1         b/a
           b/a         b/a         1
           b/a         b/a        b/a
           b/a          1          1
           b/a          1         b/a
etc.       etc.        etc.       etc.
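
The result of this appendix can be checked by brute force for a small deflector. The Python sketch below applies the comparison rule of Table IV (a factor a at each stage where setting and position agree, b where they differ) and sums the intensity arriving at one fixed position over all 2^n settings. The function names are hypothetical, and n = 5 is chosen only to match Table IV.

```python
from itertools import product


def intensity(setting, position, a, b):
    """Intensity at `position` when the DLD is set to `setting`.

    Each stage contributes a if the two addresses agree there, b otherwise.
    """
    out = 1.0
    for s, p in zip(setting, position):
        out *= a if s == p else b
    return out


def exposure_at(position, a, b):
    """Sum of the intensity at `position` over every possible DLD setting."""
    n = len(position)
    return sum(intensity(s, position, a, b) for s in product((0, 1), repeat=n))


E = 10.0
a, b = E / (1.0 + E), 1.0 / (1.0 + E)   # E = a/b with a + b = 1
total = exposure_at((0, 0, 0, 1, 0), a, b)
```

Here `total` equals (a + b)^5 = a^5 (1 + 1/E)^5 to rounding error, and the same value is obtained for any other choice of position, which is the symmetry that the argument above relies on.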





APPENDIX D 


R Ratio For The Reflection Mode Of Operation* 


A beam of light traveling through the DLD breaks up into 2^n exit
beams due to the imperfect modulators. These intensities are given by
the terms in the expansion of (a + b)^n as shown in Appendix C. We
now need to ask how much of the light comes back through the
second aperture after being reflected from the focal plane (see Fig. 9).

Let us consider one term of the expansion of (a + b)^n, say a^{n−r}b^r.
We state that this system is reciprocal and that if a^{n−r}b^r of the
incident beam exits the DLD, then if unity power were directed
through the DLD in exactly the opposite direction the same fraction
of power, i.e., a^{n−r}b^r, will pass through the aperture.

Thus, for an n unit DLD there will be 2^n exit terms and each of
these terms, for example a^{n−r}b^r, will generate one term that contributes
to the intensity at the second aperture; and for the example above that
corresponds to (a^{n−r}b^r)^2. The total of the terms that exit through
the second aperture is then the sum of the squares of the exit terms
and can be generated by (a^2 + b^2)^n.

* This appendix represents the results of calculations performed jointly by
J. T. Sibilia and the author.

Before we can add up the 2^n terms in the second aperture, we must
know something about the relative phases of the terms. Each of the
2^n exit terms in the expansion of (a + b)^n traverses a different
optical path length in the DLD. This is because
light traverses some of the prisms as an ordinary ray and
others as an extraordinary ray, and the combinations of such paths
are different for each of the 2^n exit beams. Thus, unless the DLD has
been specifically designed to the contrary, each path has a different
phase delay in passing through the DLD.

A term in the expansion of (a^2 + b^2)^n such as a^{2(n−r)}b^{2r} represents an
E field of [a^{2(n−r)}b^{2r}]^{1/2} and a phase factor φ. Consider the sum of all
the n!/[(n − r)! r!] terms of the type a^{2(n−r)}b^{2r}:

$$\left[a^{2(n-r)}b^{2r}\right]^{1/2}\left(e^{i\varphi_1} + e^{i\varphi_2} + \cdots\right). \tag{25}$$

The intensity of the sum of all a^{2(n−r)}b^{2r} terms is given by the
square of (25):

$$a^{2(n-r)}b^{2r}\left|e^{i\varphi_1} + e^{i\varphi_2} + \cdots\right|^2. \tag{26}$$

A series of phase terms such as in (26) can add as follows:

$$\left|e^{i\varphi_1} + e^{i\varphi_2} + \cdots\right|^2 = \frac{n!}{(n-r)!\,r!} \quad \text{for random phases}$$

$$\left|e^{i\varphi_1} + e^{i\varphi_2} + \cdots\right|^2 = \left[\frac{n!}{(n-r)!\,r!}\right]^2 \quad \text{for the same phase.}$$

As explained earlier, all of the phases are, in general, different, and so
we will use the random phase addition. The intensity at the second
aperture is then the sum of all of the terms in (a^2 + b^2)^n.

The R ratio, the ratio of the light from the main beam, a^{2n}, to the
light from the remaining positions, (a^2 + b^2)^n − a^{2n}, is then

$$R_3 = \frac{a^{2n}}{(a^2 + b^2)^n - a^{2n}} = \frac{1}{\left(1 + \dfrac{1}{E^2}\right)^n - 1},$$

which is the same as that used in Section IV.
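
The random-phase result can be verified by enumerating every path of a small deflector. The Python sketch below squares each single-pass path intensity, which is the reciprocity argument of this appendix, adds the squares as intensities, and compares the resulting ratio with the closed form 1/[(1 + 1/E^2)^n − 1]. The function names and the choice n = 8 are illustrative assumptions.

```python
from itertools import product


def round_trip_ratio(n, e):
    """Main-beam to background ratio at the second aperture, by enumeration."""
    a, b = e / (1.0 + e), 1.0 / (1.0 + e)   # E = a/b with a + b = 1
    main = a ** (2 * n)
    total = 0.0
    for path in product((a, b), repeat=n):
        single_pass = 1.0
        for factor in path:
            single_pass *= factor
        total += single_pass**2             # random-phase (intensity) addition
    return main / (total - main)


def closed_form_ratio(n, e):
    """R_3 = 1 / [(1 + 1/E^2)^n - 1]."""
    return 1.0 / ((1.0 + 1.0 / e**2) ** n - 1.0)
```

For n = 8 and E = 10 the two values agree to rounding error. If the paths were instead added in phase, each group of equal terms would grow as the square of its binomial coefficient, and the background would be considerably larger.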
Transistor Distortion Analysis Using
Volterra Series Representation 


By S. NARAYANAN 


(Manuscript received January 3, 1967) 


Intermodulation distortion due to nonlinear elements in transistors is
analyzed using Volterra series representation. It is shown that this technique
is well suited for the analysis of transistor distortion where the nonlinearities
are small but frequency dependent. An ac transistor model incorporating
four nonlinearities is briefly described. The nonlinear nodal equations of
the model are successively solved by expressing nodal voltages in terms of
the Volterra series expansion of the input voltage. Based on this analysis,
a digital computer program has been developed which computes the second
and the third harmonic distortion for a given set of input frequencies and
transistor parameters. The results compare favorably with measured values.
This method also enables the derivation of closed form ac expressions for a
simplified model; these expressions show the dependence of distortion on
frequency, load and source impedances, bias currents and voltages, and
the parameters of the transistor. The technique is also extended to cascaded
transistors, and simplified expressions for the overall distortion in terms
of the distortion and gain of individual transistors are derived. Finally, a
few pertinent practical applications are discussed.


I. INTRODUCTION 


Solid-state long-haul analog communication systems are being de- 
signed for higher frequencies to meet the growth in demand. One of 
the more critical and significant problems facing the system designer 
is intermodulation noise arising from transistor nonlinearities. Thus, 
an analysis of transistor distortion at higher frequencies is a practical 
problem; this paper investigates the transistor distortion using the 
Volterra series as an analysis tool. 

Transistor distortion has been investigated in some detail previously.
Many authors have considered the exponential nonlinear relation between
emitter current and emitter-to-base voltage which is important
at low currents.''’’***"? The effect of frequency on this nonlinear source
alone has been reported.” Three nonlinearities (exponential, avalanche,
and h_FE at dc) have been examined by Riva, Beneteau and Dallavolta.°
For currents up to 20 mA and frequencies up to 100 kHz, Meyer’’*’®
has developed a more accurate and complex model obtaining the nonlinearities
from h-parameters. However, he takes into account the frequency
dependence by assuming that the h-parameters can be written
as h' + jωh''. Moreover, he does not take into account avalanche distortion,
nor has he extended the model to higher currents (100 mA) and
frequencies (20 MHz). The model described here considers four nonlinearities:
the exponential, avalanche, h_FE, and collector capacitance
nonlinearities. These nonlinearities are superimposed on a linear
ac equivalent circuit.’°’’* Much of the initial development of the model
with three nonlinearities was done by Thomas.’°

The transistor model is analyzed using a Volterra series representa- 
tion; this series is a generalization of the power series. In a now classic 
report, Wiener applied this analysis technique” to find the response 
of a nonlinear device to noise."* Bose has carried the theory further.’ 
Following a series of lectures by Wiener,’’ the theoretical framework, 
higher-dimensional transforms, and optimization with Gaussian inputs 
were considered by Brilliant,’° George,’’ and Chesler,’® respectively. 
Barrett’” has treated statistical inputs. The synthesis problem has been 
examined by Van Trees,’ who also applied the method to phase-locked 
loops.” The technique has been extended to discrete systems,”’”’”* and 
a class of time-variant systems.”*’’’ More recently the theory of the 
convergence of the series has been treated.”® This work relies more on 
George’s work on the higher-dimensional transform theory.” 

Even though much work has been done in this area, the Volterra 
series has not found a wide application in solving nonlinear system 
problems due to several reasons; if the rate of convergence is not rapid, 
the higher-degree terms, which are cumbersome to handle, cannot be 
neglected; hence, it cannot conveniently represent gross nonlinearities. 
It is not simple to invert the multidimensional transforms to the time 
domain, and it is not a useful technique to determine the stability of a 
nonlinear differential equation. 

The Volterra series method does, however, offer certain distinct 
advantages in analyzing transistor distortion. Since transistor distortion 
is frequency dependent, the power series is inadequate to characterize 
it; the Volterra series does indeed represent frequency dependent sys- 
tems. The nonlinearities in the transistors under consideration are 
extremely small so that the second- and third-degree terms suffice to 
characterize them. Since the output corresponding to sinusoidal input
signals is of interest, there is no need to find the inverse of the higher-dimensional
transforms; the output can be expressed in terms of the
transform of the kernel. The higher-dimensional transforms of the
kernel are complex numbers when s_i = jω_i, where s_i is the complex
variable in the transform domain; hence, these kernels can be numerically
evaluated using the computer (see Section IV). Moreover, for a
slightly simpler model closed form ac expressions can be derived. Since 
the kernels retain phase information, this approach will be useful for 
the AM-to-PM conversion problem at IF frequencies. Finally, in an 
amplifier two or more transistors are cascaded; the nonlinear behavior 
of such cascaded transistors is a significant problem. The Volterra series 
approach can be easily extended to study such cascaded transistors. 


II. AN INTRODUCTION TO VOLTERRA SERIES REPRESENTATION


A brief exposition of Volterra series with pertinent reference to the 
problem under consideration is presented below. For further details 
the reader is referred to the references cited. 

Consider a simple memoryless nonlinear system described by the
following power series; let $y(t)$ be the output and $x(t)$ the input; the
system is represented by


$$y(t) = c_1 x(t) + c_2 [x(t)]^2 + c_3 [x(t)]^3, \tag{1}$$


where $c_1$, $c_2$, $c_3$ are constants. For a time-invariant system with memory
(capacitors and inductors in an electrical network), the linear term
$[c_1 x(t)]$ is replaced by the convolution integral ($x(t) = 0$; $t < 0$)


$$y_1(t) = \int_0^t c_1(t - \tau)\, x(\tau)\, d\tau. \tag{2}$$


In the transform domain, (2) may be written

$$Y_1(s) = C_1(s) X(s). \tag{3}$$


This transform domain representation of the system $[C_1(s)]$ has been
an invaluable aid to communication engineers since it brings into
focus the frequency behavior of the system.
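
As a small illustration of evaluating a linear kernel transform at $s = j\omega$, the sketch below uses a hypothetical single-pole RC network, whose kernel $c_1(t) = e^{-t/RC}/RC$ has the transform $C_1(s) = 1/(1 + sRC)$. The network and its element values are assumptions chosen for illustration; they are not taken from the transistor model of this paper.

```python
import cmath
import math

# Hypothetical element values, chosen only for illustration.
R_OHM = 50.0
C_FARAD = 1.0e-9


def c1_transform(s: complex, r_ohm: float, c_farad: float) -> complex:
    """Transform C1(s) = 1 / (1 + s R C) of the kernel exp(-t/RC) / RC."""
    return 1.0 / (1.0 + s * r_ohm * c_farad)


def response(freq_hz: float) -> tuple:
    """Magnitude and phase (degrees) of C1 evaluated at s = j 2 pi f."""
    value = c1_transform(2j * math.pi * freq_hz, R_OHM, C_FARAD)
    return abs(value), math.degrees(cmath.phase(value))


# Corner frequency of the assumed network, where |C1| drops to 1/sqrt(2).
corner_hz = 1.0 / (2.0 * math.pi * R_OHM * C_FARAD)

for freq in (0.1 * corner_hz, corner_hz, 10.0 * corner_hz):
    magnitude, phase_deg = response(freq)
    print(f"f = {freq:.3e} Hz  |C1| = {magnitude:.4f}  phase = {phase_deg:.2f} deg")
```

The kernel transform is a single complex number at each frequency, so magnitude and phase are both available; the higher-degree kernels are evaluated in the same way at several frequencies at once.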

A generalization of the second-degree term, $c_2[x(t)]^2$, is the double
convolution integral


$$y_2(t) = \int_0^t \int_0^t c_2(t - \tau_1, t - \tau_2) \prod_{i=1}^{2} x(\tau_i)\, d\tau_i. \tag{4}$$

The output depends on the past values of the input; the above expression
involves a product of the input with itself, thus representing a quadratic
system. $c_2(t - \tau_1, t - \tau_2)$ is known as the second-degree Volterra kernel.

A two-dimensional Laplace transform can be defined for (4) after
introducing dummy variables $t_1$ and $t_2$. As shown in Appendix A, (4)
becomes


$$Y_2(s_1, s_2) = C_2(s_1, s_2) \prod_{i=1}^{2} X(s_i). \tag{5}$$


When two sinusoidal signals at frequencies $f_a$ and $f_b$ are applied
(Appendix A), the output at the harmonic frequency $f_a \pm f_b$ is given
by $[\,|C_2(f_a, \pm f_b)| \cos (2\pi(f_a \pm f_b)t + \phi_{a \pm b})]$. Since in general $C_2(f_a, f_b)$
will not be equal to $C_2(f_a, -f_b)$, different values of distortion at different
harmonic frequencies are directly reflected in the kernel. Moreover, as
in the power series case, the 2f product is less by a factor of two.
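
The factor of two between the 2f product and the $f_a \pm f_b$ products can be checked numerically for the memoryless case. The sketch below passes two unit-amplitude tones through an assumed quadratic term $c_2 x^2$ and measures each product by correlation; the coefficient, tone frequencies and sampling rate are hypothetical values chosen so that one second of samples holds whole cycles of every product.

```python
import math

# Hypothetical values for illustration only.
C2 = 0.1            # second-degree power series coefficient
F_A, F_B = 7, 5     # tone frequencies in Hz
SAMPLE_RATE = 64    # samples per second; one second of data is analysed


def tone_amplitude(samples, freq_hz, sample_rate):
    """Amplitude of the component at freq_hz, found by correlating the
    samples with a cosine and a sine at that frequency."""
    n = len(samples)
    re = sum(y * math.cos(2 * math.pi * freq_hz * k / sample_rate)
             for k, y in enumerate(samples))
    im = sum(y * math.sin(2 * math.pi * freq_hz * k / sample_rate)
             for k, y in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


samples = []
for k in range(SAMPLE_RATE):
    t = k / SAMPLE_RATE
    x = math.cos(2 * math.pi * F_A * t) + math.cos(2 * math.pi * F_B * t)
    samples.append(C2 * x * x)   # quadratic term only

sum_product = tone_amplitude(samples, F_A + F_B, SAMPLE_RATE)
difference_product = tone_amplitude(samples, F_A - F_B, SAMPLE_RATE)
second_harmonic = tone_amplitude(samples, 2 * F_A, SAMPLE_RATE)

print(sum_product, difference_product, second_harmonic)
```

In this memoryless case the sum and difference products are equal; with a frequency-dependent kernel, $C_2(f_a, f_b)$ and $C_2(f_a, -f_b)$ generally differ, which is the point made above.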

Likewise, the third-degree term $[c_3(x(t))^3]$ can be generalized to a
triple convolution integral;


$$y_3(t) = \int_0^t \int_0^t \int_0^t c_3(t - \tau_1, t - \tau_2, t - \tau_3) \prod_{i=1}^{3} x(\tau_i)\, d\tau_i. \tag{6}$$


In the transform domain (6) may be written


$$Y_3(s_1, s_2, s_3) = C_3(s_1, s_2, s_3) \prod_{i=1}^{3} X(s_i). \tag{7}$$


The magnitude of the signal at the harmonic frequency $f_a + f_b - f_c$
due to the three fundamental signals at $f_a$, $f_b$ and $f_c$ is given by $|C_3(f_a,
f_b, -f_c)|$. The constants like 1/4 for a '3f' product are the same as
obtained from the power series approach.
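
For reference, the power series constants mentioned above follow from elementary trigonometric identities. The worked block below is a sketch for the memoryless term $c_3 x^3$ only, with tone amplitudes $A$, $B$, $C$ and phases $\theta$, $\alpha$, $\beta$, $\gamma$ introduced here for illustration.

$$\cos^3\theta = \tfrac{3}{4}\cos\theta + \tfrac{1}{4}\cos 3\theta
\quad\Longrightarrow\quad
c_3 (A\cos\theta)^3 \ \text{contains}\ \tfrac{1}{4} c_3 A^3 \cos 3\theta ,$$

$$\cos\alpha\cos\beta\cos\gamma = \tfrac{1}{4}\left[\cos(\alpha+\beta+\gamma) + \cos(\alpha+\beta-\gamma) + \cos(\alpha-\beta+\gamma) + \cos(-\alpha+\beta+\gamma)\right],$$

$$c_3 (A\cos\alpha + B\cos\beta + C\cos\gamma)^3 \ \text{contains}\ 6\, c_3 ABC \cos\alpha\cos\beta\cos\gamma
\quad\Longrightarrow\quad
\tfrac{3}{2}\, c_3 ABC \cos(\alpha+\beta-\gamma).$$

For equal tone amplitudes the $f_a + f_b - f_c$ product is therefore six times (about 15.6 dB above) the 3f product of a single tone.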

Later in the paper (in Section IV) the cascade relations in the trans- 
form domain are frequently used; their physical significance is discussed 
in detail in Section VI. (See also Fig. 1.) The cascade formulae and the 
procedure for deriving them are given in Appendix A. 

The second and third harmonic distortion are defined as the second
and third harmonic power in dBm, respectively, when the fundamental
power at the output of the transistor is at zero dBm (one milliwatt).
In the analysis of the model in Section IV, the output voltage is
expressed in terms of a Volterra series of the input voltage. Thus, the
kernels $C_1(s_1)$, $C_2(s_1, s_2)$, and $C_3(s_1, s_2, s_3)$ are the voltage transfer
ratios; for a given load $R_L$, the second and the third harmonic distortion
in dBm are given by the following expressions:
Fig. 1 - Two cascaded systems: $y = D(x)$ followed by $z = E(y)$, with overall response $z = F(x)$.


III. THE JUNCTION TRANSISTOR NONLINEAR MODEL 


A model is a simple but realistic representation of a physical phe- 
nomenon in terms of measurable parameters such that the phenomenon 
can be analyzed, and controlled if possible. The linear equivalent circuit 
of a transistor is one such example. In reality, several elements of the 
transistor equivalent circuit are not linear but are linearized versions 
of nonlinear functions; they are the first-degree terms of the Taylor’s 
series expansion of the nonlinear functions. Hence, a logical way to 
develop the nonlinear model is to consider the second- and third-degree 
terms of the Taylor’s series expansion; thus, the emitter resistance
(exponential nonlinearity), current gain ($h_{FE}$ and avalanche
nonlinearity), and the collector capacitance (collector capacitance
nonlinearity) have been represented by nonlinear voltage-dependent current
generators whose parameters are higher-degree Taylor’s series terms.
This approach has another advantage: the nonlinearities themselves are
difficult to measure since they are small, but it is not too difficult to
measure the overall functions and to curve fit with the known theoretical
or empirical relations. These nonlinearities are superimposed on the
linear equivalent circuit (Fig. 2). The nonlinearities are described next.


* The factors 1/2 and 1/4 normalize the distortion to 2f and 3f products.


Fig. 2 - Common-emitter nonlinear equivalent circuit.


3.1 Exponential Nonlinearity


The emitter current, $I_E$, is related to the emitter voltage, $V_E$, by
the exponential relation


$$I_E = A\left[\exp\left(\frac{q}{KT} V_E\right) - 1\right] + B, \tag{10}$$


where $K$ = Boltzmann’s constant,

$q$ = electron charge in coulombs,

$T$ = temperature in degrees Kelvin,

and $A$ and $B$ are constants which depend on the transistor parameters
(Ref. 27; p. 181, p. 249). An experimental curve of the emitter current
$I_E$ and the emitter-to-base voltage $V_{EB}$ is shown in Fig. 3. This
nonlinearity is expressed as a voltage-dependent current generator by a
Taylor’s series expansion of (10) as follows:


$$i_e = K(v_2) = K_1 v_2 + K_2 v_2^2 + K_3 v_2^3, \tag{11}$$


where the Taylor’s series coefficients are derived in terms of known
parameters, the emitter resistance $r_e$, and the emitter bias current $I_E$;
i.e.,

$$K_1 = \frac{1}{r_e}, \qquad K_2 = \frac{1}{2 I_E r_e^2}, \qquad K_3 = \frac{1}{6 I_E^2 r_e^3}. \tag{12}$$
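
The Taylor coefficients of an exponential characteristic can be computed directly from the bias current. The sketch below assumes that the bias current is much larger than the constants $A$ and $B$ in (10), so that each derivative of the exponential is $q/KT$ times the previous one and $r_e = KT/(qI_E)$; the function name and the default temperature are illustrative choices, not values from the paper.

```python
# Physical constants (SI exact values).
BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19


def exponential_coefficients(bias_current_a, temperature_k=300.0):
    """Return (r_e, K1, K2, K3) for i_e = K1 v + K2 v^2 + K3 v^3.

    Assumes I_E = A exp(q V / K T) dominates the constant terms, so the
    n-th derivative with respect to voltage is (q / K T)^n I_E.
    """
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELECTRON_CHARGE_C
    r_e = thermal_voltage / bias_current_a          # small-signal emitter resistance
    k1 = 1.0 / r_e                                  # first derivative
    k2 = 1.0 / (2.0 * bias_current_a * r_e ** 2)    # second derivative / 2!
    k3 = 1.0 / (6.0 * bias_current_a ** 2 * r_e ** 3)  # third derivative / 3!
    return r_e, k1, k2, k3


r_e, k1, k2, k3 = exponential_coefficients(0.1)     # 100 mA bias, assumed 300 K
print(f"r_e = {r_e:.4f} ohm, K1 = {k1:.4g}, K2 = {k2:.4g}, K3 = {k3:.4g}")
```

Under this assumption the normalized ratios $K_2/K_1^2 = 1/(2I_E)$ and $K_3/K_1^3 = 1/(6I_E^2)$ depend only on the bias current, which is why a larger emitter current reduces the exponential contribution to distortion.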


3.2 Avalanche and $h_{FE}$ Nonlinearity


The collector current is a nonlinear function of the emitter current
at higher values of current ($h_{FE}$ nonlinearity) and of the collector-to-base
voltage at higher values of voltage (avalanche nonlinearity). $h_{FE}$,
the ratio of $I_C$ to $I_B$, is plotted as a function of collector current $I_C$
in Fig. 4. It is seen that the following empirical relation matches the
experimental result (Fig. 4):


$$h_{FE} = \frac{h_{FE\,max}}{1 + a \log^2 \dfrac{I_C}{I_{C\,max}}}, \tag{13}$$


where $h_{FE\,max}$ is the maximum value of $h_{FE}$, $I_{C\,max}$ is the value of $I_C$
at which $h_{FE\,max}$ occurs, and $a$ is a constant.
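
The empirical relation can be evaluated as in the following sketch. The numerical values are those printed on Fig. 4 for transistor A-2436 ($a = 0.38$, $I_{C\,max} = 634$, $h_{FE\,max} = 122$); the unit of $I_{C\,max}$ is not legible in the figure, so the currents below are simply expressed in that same unit, and the use of the natural logarithm is an assumption.

```python
import math

# Values read from Fig. 4 (unit of I_C_MAX not stated; log base assumed natural).
A_CONSTANT = 0.38
I_C_MAX = 634.0
H_FE_MAX = 122.0


def h_fe(i_c, h_fe_max=H_FE_MAX, i_c_max=I_C_MAX, a=A_CONSTANT):
    """Empirical current gain: h_FE_max / (1 + a log^2(I_C / I_C_max))."""
    if i_c <= 0:
        raise ValueError("collector current must be positive")
    return h_fe_max / (1.0 + a * math.log(i_c / i_c_max) ** 2)


for current in (I_C_MAX / 10, I_C_MAX / 2, I_C_MAX, 2 * I_C_MAX):
    print(f"I_C = {current:8.1f}  h_FE = {h_fe(current):6.1f}")
```

The relation is symmetric in $\log I_C$ about $I_{C\,max}$, so the gain falls off equally for currents a given ratio above or below the peak.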

The avalanche nonlinearity is due to avalanche multiplication which
occurs at higher collector-to-base voltage. It is determined from the
collector characteristic, which is a plot of collector current ($I_C$) and
collector-to-emitter voltage ($V_{CE}$) (Fig. 5). The empirical Miller’s
avalanche multiplication factor is given by


$$M = \frac{1}{1 - \left(\dfrac{V_{CB}}{V_{CBO}}\right)^n}, \tag{14}$$


Fig. 3 - Exponential nonlinearity - measured curve (current in milliamperes versus voltage in volts).


998 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1967


Fig. 4 - $h_{FE}$ nonlinearity - calculated and measured curves (transistor A-2436; calculated with $a$ = 0.38, $I_{C\,max}$ = 634, $h_{FE\,max}$ = 122).


where $V_{CBO}$ is the sustained voltage, and the exponent $n$ is determined
by experiment. From expressions (13) and (14), the ratio $I_C/I_B$ is
given by


$$\frac{I_C}{I_B} = \frac{h_{FE\,max}}{1 + a \log^2 \dfrac{I_C}{I_{C\,max}}} \cdot \frac{1}{1 - \left(\dfrac{V_{CB}}{V_{CEO}}\right)^n}, \tag{15}$$


where $V_{CEO} = V_{CBO} \sqrt[n]{1 - \alpha}$ and $V_{CB} = V_{CE} - V_{BE}$.
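
Miller's empirical multiplication factor has the general form $1/[1 - (V/V_B)^n]$, where $V$ is the junction voltage and $V_B$ a breakdown-type reference voltage. The sketch below evaluates that general form only; the reference voltage and exponent are hypothetical values, not the measured parameters of the transistor studied here.

```python
def miller_factor(voltage, reference_voltage, exponent):
    """Miller's empirical avalanche multiplication 1 / (1 - (V / V_B)^n).

    Valid for 0 <= V < V_B; the factor grows without bound as V -> V_B.
    """
    if not 0.0 <= voltage < reference_voltage:
        raise ValueError("voltage must lie in [0, reference_voltage)")
    return 1.0 / (1.0 - (voltage / reference_voltage) ** exponent)


# Hypothetical reference voltage (60 V) and exponent (3) for illustration.
for volts in (0.0, 10.0, 20.0, 30.0, 50.0):
    print(f"V = {volts:5.1f} V  M = {miller_factor(volts, 60.0, 3.0):.4f}")
```

The factor stays close to one at low voltage and rises steeply near the reference voltage, which is why the avalanche nonlinearity matters mainly at higher bias voltages.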
Fig. 5 - Avalanche nonlinearity - calculated and measured curves.

The ac $i_c$ can be expressed in terms of $i_e$ and $v_{cb}\,[v_3 - v_1]$ by a Taylor’s
series expansion of (15). Since $i_e$ is a function of emitter voltage $v_2$, $i_c$
is represented by a current generator $g(v_2, v_3 - v_1)$; for convenience in
notation it is separated into a linear term $g_1(v_2, v_3 - v_1)$, a second-
degree term $g_2(v_2, v_3 - v_1)$ and a third-degree term $g_3(v_2, v_3 - v_1)$.
The linear term equals $M_0(a_1 K_1)v_2 + M_1(v_3 - v_1)$. The second-degree
term is given by $a_2 M_0 K_1^2 (v_2)^2 + m_2(v_3 - v_1)^2 + (a_1 M_1)K_1 v_2(v_3 - v_1)$.
The coefficients $a_1$, $a_2$, $M_1$, $m_2$, etc., and the third-degree term are
given in Appendix B.


3.3 Collector Capacitance Nonlinearity 


The collector capacitance is a nonlinear function of collector-to-base
voltage ($V_{CB}$) since the depletion layer width is a function of $V_{CB}$.
The exact functional relationship is determined by plotting the imaginary
part of the common-base output admittance h-parameter as a function of
collector-to-base voltage ($V_{CB}$) as shown in Fig. 6. It is evident from the
figure that $C_c$ follows the 1/3 voltage law (Ref. 19; Equation 5-96);


$$C_c = k(V_{CB})^{-1/3}. \tag{16}$$


Fig. 6 - Collector capacitance nonlinearity - calculated and measured curves (common base).
This nonlinearity is represented as a frequency (differentiation) and
voltage-dependent current generator as follows:


$$i_\gamma = \gamma(v_3 - v_2) = \gamma_1 \frac{d}{dt}(v_3 - v_2) + \gamma_2 \frac{d}{dt}(v_3 - v_2)^2 + \gamma_3 \frac{d}{dt}(v_3 - v_2)^3, \tag{17}$$


where $\gamma_1 = C_c$, and where $\gamma_2$ and $\gamma_3$ are known from (16).
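
One way to obtain the higher-degree capacitance coefficients from the 1/3 law is to expand the stored charge about the bias voltage and differentiate with respect to time. The sketch below does this under stated assumptions: the incremental voltage $v$ is measured in the direction of increasing reverse bias, the coefficients multiply $d(v^n)/dt$, and the constant $k$ and bias voltage are hypothetical. The sign of the second coefficient depends on that polarity convention, and the paper's own expressions are in its appendices.

```python
def capacitance_coefficients(k, v_bias):
    """Coefficients of q(v) = g1 v + g2 v^2 + g3 v^3 for C(V) = k V^(-1/3),
    expanded about V = v_bias; the current is then d q / d t."""
    c0 = k * v_bias ** (-1.0 / 3.0)          # capacitance at the bias point
    g1 = c0
    g2 = -c0 / (6.0 * v_bias)                # from the -v/(3 V0) term of C
    g3 = 2.0 * c0 / (27.0 * v_bias ** 2)     # from the (2/9)(v/V0)^2 term of C
    return g1, g2, g3


def exact_charge(k, v_bias, v):
    """Exact incremental charge: integral of k V^(-1/3) from v_bias to v_bias + v."""
    return 1.5 * k * ((v_bias + v) ** (2.0 / 3.0) - v_bias ** (2.0 / 3.0))


def series_charge(coefficients, v):
    """Third-degree series approximation of the incremental charge."""
    g1, g2, g3 = coefficients
    return g1 * v + g2 * v ** 2 + g3 * v ** 3


K_CONST = 1.0e-11    # hypothetical constant k of the 1/3 law
V_BIAS = 10.0        # hypothetical bias voltage in volts
coeffs = capacitance_coefficients(K_CONST, V_BIAS)
print(coeffs, exact_charge(K_CONST, V_BIAS, 0.1), series_charge(coeffs, 0.1))
```

All coefficients scale with the bias-point capacitance, and the higher-degree ones fall as the bias voltage is raised.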

The above nonlinear current generators are incorporated in the linear 
equivalent circuit as shown in Fig. 2. The linear equivalent circuit 
parameters are obtained from the equivalent circuit characterization. 
They can, for example, be computed from the h-parameters at different 
frequencies. In general, the distortion is not a critical function of the 
linear parameters. (Figs. 14 to 17). 

All the nonlinear coefficients ($K_1$, $a_1$, $m_2$, etc.) are easily obtained
from a simple computer program. The parameters to be specified along
with typical values for transistor type A-2436 are listed in Appendix C.


IV. THE VOLTERRA KERNELS FOR THE NONLINEAR MODEL 


The Volterra series method is applied to the model to compute the 
second and the third harmonic distortion. The voltage at each node is 
a nonlinear frequency-dependent function of the input voltage. Each 
nodal voltage is expressed by a Volterra series expansion of the generator 
voltage; since the nonlinearities are small, only three terms are
considered. The kernels at each node are determined from Kirchhoff’s current
equations.


4.1 Nodal Equations 


Kirchhoff’s current law is applied at each node; the currents are
next expressed in terms of the generator voltage $v_g$, the three nodal
voltages $v_1$, $v_2$, and $v_3$, and the known linear and nonlinear parameters.
The impedances are represented by their transforms, and $\circ$ denotes
that the transform operates on the voltage across it. The nodal equations
are given below.


$$\frac{1}{Z_g(s)} \circ (v_g - v_1) + (sC_3) \circ (v_3 - v_1) = (sC_1) \circ v_1 + \frac{1}{r_b} \circ (v_1 - v_2), \tag{18}$$


$$\frac{1}{r_b} \circ (v_1 - v_2) = K(v_2) + (sC_2) \circ v_2 + \frac{1}{r_c} \circ (v_2 - v_3) - \gamma(v_3 - v_2) - g(v_2, v_3 - v_1), \tag{19}$$


$$-\gamma(v_3 - v_2) + \frac{1}{r_c} \circ (v_2 - v_3) - g(v_2, v_3 - v_1) = (sC_3) \circ (v_3 - v_1) + \frac{1}{Z_L(s)} \circ v_3, \tag{20}$$


where $K(v_2)$, $\gamma(v_3 - v_2)$ and $g(v_2, v_3 - v_1)$ are the nonlinear current
generators.


4.2 Solution Using Volterra Series 


Since each nodal voltage is to be expressed in terms of three Volterra 
kernels, there are nine unknown Volterra kernels to be determined from 
the three equations. The problem of solving for nine unknowns from 
three equations is resolved by noting that the polynomials $x$, $x^2$ and
$x^3$ are linearly independent; hence, each degree term is separately
and successively solved. The linear kernels are first determined; then 
the second-degree kernels are determined in terms of the linear kernels; 
lastly, the third-degree kernels are evaluated in terms of the first- and 
second-degree kernels. 

Let $A_1(s)$, $B_1(s)$, $C_1(s)$ denote the transforms of the linear kernels
at nodes one, two and three, respectively. From the nodal equations
(18) to (20), the following vector matrix equation is derived.


$$\begin{bmatrix} \dfrac{1}{Z_g(s)} \\ 0 \\ 0 \end{bmatrix} = P_E(s) \begin{bmatrix} A_1(s) \\ B_1(s) \\ C_1(s) \end{bmatrix}, \tag{21}$$

where

$$P_E(s) = \begin{bmatrix}
\dfrac{1}{Z_g(s)} + s(C_3 + C_1) + \dfrac{1}{r_b} & -\dfrac{1}{r_b} & -sC_3 \\[2ex]
-\dfrac{1}{r_b} + m_1 & \dfrac{1}{r_b} + sC_2 + \dfrac{1}{r_c} + K_1(1 - a_1) + s\gamma_1 & -\dfrac{1}{r_c} - m_1 - s\gamma_1 \\[2ex]
-sC_3 - m_1 & -\dfrac{1}{r_c} + a_1 K_1 - s\gamma_1 & \dfrac{1}{r_c} + sC_3 + \dfrac{1}{Z_L(s)} + m_1 + s\gamma_1
\end{bmatrix}. \tag{22}$$


Equation (21) is solved by inverting matrix $P_E(s)$ and post-multiplying
by the vector

$$\begin{bmatrix} \dfrac{1}{Z_g(s)} \\ 0 \\ 0 \end{bmatrix}.$$

For a given frequency $s = j\omega$, the computation is done numerically.
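
The numerical step can be sketched as follows: build the 3 x 3 complex nodal matrix at $s = j\omega$ and solve for the three linear kernels. The matrix entries follow the linear parts of the nodal equations (18) to (20) as read here; every element value in `PARAMS` is hypothetical, and the small Gaussian-elimination solver merely stands in for whatever matrix-inversion routine the original program used.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs for complex entries by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in range(n - 1, -1, -1):
        acc = aug[row][n] - sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = acc / aug[row][row]
    return x


def nodal_matrix(s, p):
    """Nodal admittance matrix of the linearized model at complex frequency s."""
    yg, yl = 1.0 / p["z_g"], 1.0 / p["z_l"]
    gb, gc = 1.0 / p["r_b"], 1.0 / p["r_c"]
    k1, a1, m1, gam = p["k1"], p["a1"], p["m1"], p["gamma1"]
    return [
        [yg + s * (p["c3"] + p["c1"]) + gb, -gb, -s * p["c3"]],
        [-gb + m1, gb + s * p["c2"] + gc + k1 * (1 - a1) + s * gam,
         -gc - m1 - s * gam],
        [-s * p["c3"] - m1, -gc + a1 * k1 - s * gam,
         gc + s * p["c3"] + yl + m1 + s * gam],
    ]


def linear_kernels(freq_hz, p):
    """Return [A1, B1, C1] at s = j 2 pi f for generator admittance 1/z_g."""
    s = 2j * math.pi * freq_hz
    return solve_linear_system(nodal_matrix(s, p), [1.0 / p["z_g"], 0.0, 0.0])


# Hypothetical element values (ohms, farads, siemens) for illustration only.
PARAMS = {"z_g": 50.0, "z_l": 50.0, "r_b": 10.0, "r_c": 1.0e5,
          "c1": 5.0e-12, "c2": 1.0e-9, "c3": 1.0e-12, "gamma1": 3.0e-12,
          "k1": 3.85, "a1": 0.99, "m1": 1.0e-5}

a1_kernel, b1_kernel, c1_kernel = linear_kernels(5.0e6, PARAMS)
print(abs(a1_kernel), abs(b1_kernel), abs(c1_kernel))
```

The same routine, evaluated at $s_1 + s_2$ or $s_1 + s_2 + s_3$ with a different right-hand side, serves for the higher-degree kernels.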

The second-degree terms are equated next in (18) to (20). There are
two types of second-degree terms; those arising from the unknown
second-degree kernels [for example, $(s_1 + s_2)C_1 A_2(s_1, s_2)$] and those
arising from the known nonlinear coefficients and the known linear
kernels [for example, $K_2 \prod_{i=1}^{2} B_1(s_i)$]. The terms associated with the
unknown second-degree kernels are the same as were associated with the
unknown linear kernels in (21), but at the harmonic frequency $(s_1 + s_2)$.
The following vector matrix equation is obtained for the second-degree
kernels:


$$\begin{bmatrix} 0 \\ g_2(B_1, C_1 - A_1) + \gamma_2(C_1 - B_1) - K_2(B_1) \\ -g_2(B_1, C_1 - A_1) - \gamma_2(C_1 - B_1) \end{bmatrix} = P_E(s_1 + s_2) \begin{bmatrix} A_2(s_1, s_2) \\ B_2(s_1, s_2) \\ C_2(s_1, s_2) \end{bmatrix}, \tag{23}$$


where $g_2$ and $\gamma_2$ represent the second harmonic contribution due to
$g_2(v_2, v_3 - v_1)$ and $\gamma_2(v_3 - v_2)$; hence,


$$g_2(B_1, C_1 - A_1) = [a_1 K_2 + a_2 K_1^2] \prod_{i=1}^{2} B_1(s_i) + \frac{a_1 M_1 K_1}{2} \Big[ B_1(s_1)\big(C_1(s_2) - A_1(s_2)\big) + B_1(s_2)\big(C_1(s_1) - A_1(s_1)\big) \Big] + m_2 \prod_{i=1}^{2} [C_1(s_i) - A_1(s_i)], \tag{24}$$

$$K_2(B_1) = K_2 \prod_{i=1}^{2} B_1(s_i), \tag{25}$$

$$\gamma_2(C_1 - B_1) = (s_1 + s_2)\gamma_2 \prod_{i=1}^{2} [C_1(s_i) - B_1(s_i)]. \tag{26}$$


$P_E(s_1 + s_2)$ is the matrix $P_E(s)$ with $s$ replaced by $(s_1 + s_2)$.

The vector on the left side of (23) is known. Thus, the unknown kernels
are determined by inverting the matrix $P_E(s_1 + s_2)$ and post-multiplying
by the vector on the left-hand side of (23). When $s_i = j\omega_i$, the inversion
of the matrix and the post multiplication by the vector can be done
numerically.
The procedure for obtaining the third-degree kernels is almost the
same; the significant difference is that the vector on the left side not
only contains terms arising from the third-degree nonlinear parameters
but also includes second-degree coefficients which give rise to third-
degree terms by the interaction of the first- and the second-degree
kernels. These interaction terms are denoted by $K_{23}$, $g_{23}$, $\gamma_{23}$,
respectively. For example, $K_3(B_1) = K_3 \prod_{i=1}^{3} B_1(s_i)$, whereas
$K_{23} = 2K_2 B_1(s_1) B_2(s_2, s_3)$, which shows the interaction of the first- and
the second-degree kernels. The third-degree kernels are derived from the
following equations:


$$\begin{bmatrix} 0 \\ g_3(B_1, C_1 - A_1) + \gamma_3(C_1 - B_1) - K_3(B_1) + g_{23} + \gamma_{23} - K_{23} \\ -g_3(B_1, C_1 - A_1) - \gamma_3(C_1 - B_1) - g_{23} - \gamma_{23} \end{bmatrix} = P_E(s_1 + s_2 + s_3) \begin{bmatrix} A_3(s_1, s_2, s_3) \\ B_3(s_1, s_2, s_3) \\ C_3(s_1, s_2, s_3) \end{bmatrix}, \tag{27}$$


where $g_3$, $g_{23}$ are given in Appendix B.
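
The successive solution, degree by degree, is easiest to see on a much smaller circuit than the transistor model. The sketch below is an assumed one-node example, not the model of this paper: a generator with resistance `r_g` drives a node loaded by a capacitor `c` and a nonlinear conductance $i = k_1 v + k_2 v^2 + k_3 v^3$. Each higher-degree kernel is driven by products of lower-degree kernels and divided by the same node admittance evaluated at the sum frequency; the third-degree drive includes the interaction of the first- and second-degree kernels, symmetrized over the three frequency orderings.

```python
# Hypothetical one-node circuit values for illustration only.
NODE = {"r_g": 50.0, "k1": 0.02, "k2": 0.4, "k3": 5.0, "c": 1.0e-9}


def node_admittance(s, p):
    """Linear admittance seen at the node: 1/r_g + k1 + s c."""
    return 1.0 / p["r_g"] + p["k1"] + s * p["c"]


def first_kernel(s1, p):
    """Linear kernel: generator admittance over node admittance."""
    return (1.0 / p["r_g"]) / node_admittance(s1, p)


def second_kernel(s1, s2, p):
    """Second-degree kernel, driven by k2 times a product of linear kernels."""
    drive = -p["k2"] * first_kernel(s1, p) * first_kernel(s2, p)
    return drive / node_admittance(s1 + s2, p)


def third_kernel(s1, s2, s3, p):
    """Third-degree kernel: direct k3 term plus the first/second-degree
    interaction term, averaged over the three frequency orderings."""
    b1 = [first_kernel(s, p) for s in (s1, s2, s3)]
    direct = -p["k3"] * b1[0] * b1[1] * b1[2]
    interaction = -(2.0 * p["k2"] / 3.0) * (
        b1[0] * second_kernel(s2, s3, p)
        + b1[1] * second_kernel(s1, s3, p)
        + b1[2] * second_kernel(s1, s2, p))
    return (direct + interaction) / node_admittance(s1 + s2 + s3, p)


w = 2j * 3.141592653589793 * 5.0e6
print(first_kernel(w, NODE), second_kernel(w, w, NODE), third_kernel(w, w, -w, NODE))
```

At zero frequency the three kernels reduce to the coefficients obtained by reverting the static power series, which gives a simple check on the signs of the driving terms.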

A computer program has been developed which calculates the kernels
and the second and the third harmonic distortion. It uses existing
programs to invert the matrix $P_E(s)$. The nonlinear coefficients are
computed from the known and measured parameters. Computed and
measured results at different currents are given in Fig. 7. The program has
been extended to common-base and common-collector configurations.


V. SIMPLIFIED DISTORTION EXPRESSIONS, THEIR PHYSICAL SIGNIFICANCE 
AND COMPARISON WITH EXPERIMENTAL RESULTS 


Another advantage of the Volterra series method is that it permits 
derivation of closed-form expressions for second and third harmonic 
distortion. These equations show the interaction between the various 
nonlinear parameters and the effect of frequency. 

The model includes the base resistance ($r_b$), the emitter resistance
($r_e$), the diffusion capacitance ($C_2$), the load ($R_L$) and the source
impedances $Z_g(s)$, and three nonlinearities, namely, exponential,
avalanche, and $h_{FE}$ nonlinearities. In the computer program $C_1$, $C_3$,
$r_c$, $C_c$, $m_1$ and collector capacitance nonlinearity have been taken into
account. The expressions given below are for the common-emitter
configuration.


Fig. 7 - Comparison of experimental and computed results.

5.1 The Second Harmonic Distortion Term 


The second harmonic distortion in dBm (8) is given by 


10° | (r, + Z,(s))-(K, + sC.) +1 | 
R, | Li + Z,(s)-(Kid — a1) + sC.] +1 


| 2 9 SCn To ; ( Ct) 
| e, m(R, 7G =) we I] ae a 


+ Ks _K Ks, ( (Z,(s) + 1,)-sCo + 1 ) | 
(Z,(s) + 7,)-(K, + sCo) + 1 


where $s_1 = j\omega_a$, $s_2 = \pm j\omega_b$ and $s = j\omega_a \pm j\omega_b$.








M,,., ~~ 20 log 5 9 


(28) 





5.2 The Third Harmonic Distortion Term


In the third harmonic distortion term given below, the interaction 
terms due to the first- and the second-degree kernels have not been in- 
cluded mainly to reduce the complexity; in certain cases, they may be 
significant. 


10° 





M ~~ 20 log 


Saxbuae 





1 
®4 R r, + Z Oy: (K, a as) aC) ed 
| es - Mm (Ri 2 RrsCzr, il 1 s,s; soe) TT I (r, + s,Cor _ 








3 ak, 3 (aK, )" ak, 
i Ks alr, + Z,(s))(sC2) + 1] iti Zonk 
(a,K,)° (r, + Z,(s)):(sC2 + K,) + 1° (am)*Kj 
»M, s Cor, | | | 
=, Te)! Ee ete ° 3 |] , (29) 
where $s_1 = j\omega_a$, $s_2 = j\omega_b$, $s_3 = \pm j\omega_c$ and $s = s_1 + s_2 + s_3$ and
$s_i s_j = s_1 s_2 + s_2 s_3 + s_3 s_1$.


5.3 Physical Interpretation of the Distortion Terms 


The interaction of different nonlinearities and their dependence on 
load impedance, source impedance, bias currents, bias voltage and 
frequency is indeed somewhat complex. However, the closed form ex- 
pressions derived above give a general qualitative picture which will 
be discussed now. 


5.3.1 Effect of Frequency


It is important to know the effect of frequency on distortion. The
distortion depends not only on the frequencies of the fundamental tones
but also on the harmonic frequency of interest. As shown in Fig. 8,
$M_2$ due to the $a + b$ product is better than $M_2$ of the $a - b$ product by 10 dB
with the two tones at 8.382 and 7.266 MHz. These measurements were
made with the transistor biased at 100 mA, 10 V and with $R_L$ = 50 Ω
and $R_g$ = 50 Ω. In curve (a) of Fig. 8, the fundamental tone was
increased from 2 MHz to 10.5 MHz and signals at 2f and 3f were measured.
It is seen that both $M_2$ and $M_3$ improved with increase in frequency.
A theoretical explanation on the basis of dominant terms (in this range
of parameter values) in (28) and (29) is given below. In (28) as well as
in (29) the terms in brackets are multiplied by a frequency-dependent
term


$$\frac{(r_b + R_g)(K_1 + sC_2) + 1}{(r_b + R_g)(K_1(1 - a_1) + sC_2) + 1}.$$


In this range of frequency ($s$ = harmonic frequency), if $K_1(1 - a_1) \le |sC_2|$
and if $|(r_b + R_g)sC_2| > 1$ but $K_1 > |sC_2|$, then the above
term reduces to $K_1/sC_2$ which decreases with increase in frequency.
However, the avalanche terms ($M_2$, $M_3$, etc.) involve the terms $sC_2$
[in (28) and (29)] and $s_i C_2$ in the numerator. Thus, if the avalanche
terms are dominant, as at higher voltages, there should be no net
contribution due to avalanche terms alone. The exponential terms [$K_2/(K_1)^2$
and $K_3/(K_1)^3$] are multiplied by the factor


$$\frac{(r_b + R_g) \cdot sC_2 + 1}{(r_b + R_g) \cdot (K_1(1 - a_1) + sC_2) + 1}.$$


This term is independent of frequency if $|sC_2(r_b + R_g) + 1| > 1$.
Thus the above discussion shows that distortion will improve with
increase in frequency at lower voltages and if $|sC_2(r_b + R_g) + 1| < 1$.
To verify this statement, the voltage was increased to 20 volts and the
input resistance changed to 225 Ω. The plots of $M_2$ and $M_3$ with
frequency, as measured, are given in curves labeled (b) in Fig. 8. It is
seen that $M_2$ and $M_3$ do not improve with increase in frequency. The
small improvement can be attributed to the $h_{FE}$ terms.

Fig. 8 - Variation of $M_2$, $M_3$ with frequency (experimental results, A2436 No. 27; $R_L$ = 50 Ω; $M_2$, $M_3$ in dBm versus fundamental frequency $f$; products $f_1 + f_2$ and $f_1 + f_2 - f_3$ at the harmonic frequency).

In general, increase in frequency increases distortion; this is especially 
true for the common base configuration. But as shown above, for certain 
ranges of frequency and certain values of source impedance, distortion 
can improve with frequency. 
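
The behavior of the frequency-dependent multiplier $[(r_b + R_g)(K_1 + sC_2) + 1]/[(r_b + R_g)(K_1(1 - a_1) + sC_2) + 1]$ can be explored numerically. The sketch below uses hypothetical element values chosen so that the stated conditions hold over 2 to 10.5 MHz; they are not the measured parameters of the transistor.

```python
import math

# Hypothetical values for illustration only.
R_TOTAL = 60.0      # r_b + R_g in ohms
K1 = 3.85           # siemens, roughly 1 / r_e at 100 mA
A1 = 0.99           # linear current-gain coefficient
C2 = 2.0e-9         # diffusion capacitance in farads


def frequency_factor(freq_hz, r_total=R_TOTAL, k1=K1, a1=A1, c2=C2):
    """Multiplier applied to the bracketed distortion terms at the
    harmonic frequency s = j 2 pi f."""
    s = 2j * math.pi * freq_hz
    numerator = r_total * (k1 + s * c2) + 1.0
    denominator = r_total * (k1 * (1.0 - a1) + s * c2) + 1.0
    return numerator / denominator


for freq in (2.0e6, 5.0e6, 10.5e6):
    factor_db = 20.0 * math.log10(abs(frequency_factor(freq)))
    print(f"f = {freq / 1e6:5.1f} MHz  factor = {factor_db:6.2f} dB")
```

With these assumed values the multiplier falls as the harmonic frequency rises, consistent with the improvement described for curve (a); other parameter ranges can reverse the trend.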


5.3.2 Effect of Load Resistance, $R_L$


The load resistance is an external parameter which the circuit designer
can vary; hence, it is useful to know its effect on distortion. The second
and the third harmonic terms are multiplied by $1/\sqrt{R_L}$ and $1/R_L$
terms, respectively; this shows that the distortion can be reduced by
increasing $R_L$. However, the avalanche terms $M_1$, $M_2$, and $M_3$ are
multiplied by the $R_L$ term, so that increasing $R_L$ will increase the
contribution from the avalanche terms. Thus, an increase in $R_L$ may
increase distortion or reduce it due to cancellation. (The contribution from
the collector capacitance terms also increases with increase in load ($R_L$).)
Because of the above interaction, for a given set of parameters and
input frequencies and the harmonic frequency of interest there exists
an optimum load $R_L$; this, of course, can be determined using the
computer program. In Fig. 9, the measured values of $M_2$ and $M_3$ at different
values of $R_L$ are plotted; in both cases increasing $R_L$ reduces distortion
until the optimum value is reached and then distortion increases with
increase in $R_L$.


5.3.3 Effect of Source Impedance, $Z_g(s)$


Source impedance is another important external parameter. The
source impedance affects the exponential nonlinearities $K_2/K_1^2$ in (28)
and $K_3/K_1^3$ in (29) by the factor


$$\frac{(r_b + Z_g(s))\, sC_2 + 1}{(r_b + Z_g(s)) \cdot [sC_2 + K_1(1 - a_1)] + 1}.$$


At low frequencies, this nonlinearity is reduced by the factor
$1/[(1 - a_1)(R_g + r_b)K_1 + 1]$. Thus, an increase in $R_g$ will reduce distortion from
this source. However, the contributions from other nonlinearities are
increased by

$$\frac{1 + K_1(R_g + r_b)}{1 + K_1(R_g + r_b)(1 - a_1)}.$$

It is seen from this expression that if $K_1(R_g + r_b)(1 - a_1)$ is greater
than 1, the other nonlinearities are not affected by the increase in $R_g$.
Thus, if the exponential term is dominant, increasing $R_g$ reduces
distortion at low frequencies. At higher harmonic frequencies if
$|sC_2(Z_g(s) + r_b)| > 1$, the distortion terms are independent of the
source impedance since the $[r_b + Z_g(s)]$ term in the numerator and
denominator cancel. This is well illustrated in the measured results of
Fig. 10. The second harmonic frequency being 0.7 MHz, $|sC_2(R_g + r_b)|$
is not much greater than one up to $R_g$ = 100 Ω; hence, the second
harmonic distortion improves with increase in source resistance up to
140 Ω. Further increase in $R_g$ does not cause much change in distortion.
The third harmonic frequency is 17.3 MHz; hence, a change in $R_g$ does
not affect $M_3$ appreciably ($|(sC_2) \cdot (R_g + r_b)| > 1$).

Fig. 9 - Variation of $M_2$, $M_3$ with load resistance (experimental result; A2436 No. 27).


5.3.4 Effect of Bias Current 


Increase in bias current usually reduces distortion for the following
reasons. The increase in emitter bias current reduces the exponential
terms

$$\frac{K_2}{(K_1)^2} \left( \propto \frac{1}{I_E} \right) \quad \text{and} \quad \frac{K_3}{(K_1)^3} \left( \propto \frac{1}{(I_E)^2} \right).$$

Fig. 11 shows the effect of bias current on $h_{FE}$ terms; $a_2$ decreases
with increase in $I_C$ by

$$1 + \log \frac{I_C}{I_{C\,max}};$$

it becomes zero at $I_C = I_{C\,max}/e$, and then becomes negative, and
increases with further increase in $I_C$. The coefficient $a_3$ decreases with
bias current $I_C$. Thus, in general, an increase in bias current has the
effect of reducing both the second and third harmonic distortion (Fig. 7)
(at least until $a_2 = 0$).

Fig. 10 - Variation of $M_2$, $M_3$ with source resistance (experimental result; fundamental tones 16.6, 15.2, 14.5 MHz; $f_1 + f_2 - f_3$ = 17.3 MHz; $f_2 - f_3$ = 0.7 MHz).


5.3.5 Effect of Bias Voltage


Whereas exponential and $h_{FE}$ terms are functions of bias current, the
avalanche and collector capacitance nonlinearities are affected by the
bias voltage. The coefficient $M_1$ increases with the voltage; but $M_2$ and
$M_3$ increase much more rapidly (Fig. 12). (Both the collector capacitance
nonlinear coefficients $\gamma_2$, $\gamma_3$ decrease with the increase in bias voltage.)
The effects of change in bias voltage are especially pronounced at higher
load resistance since avalanche (and collector capacitance) terms
become dominant. The third harmonic distortion decreases more with
the increase in voltage (Fig. 13) than the second harmonic distortion
does.

Fig. 12 - Variation of $M_1$, $M_2$, $M_3$ with bias voltage.

Fig. 13 - Variation of $M_2$, $M_3$ with voltage at 100 mA.

The physical significance of the closed form expressions has been
qualitatively discussed. Precise quantitative estimates can and have
been obtained using the computer program. For example, the effect
of varying linear parameters by fifty percent of their original values was
studied. The results show that the distortion does not critically depend
on the linear parameters (Figs. 14 to 17). The other transistor parameters
such as $I_{C\,max}$, $V_{CBO}$, $n$, etc., can also be varied.


VI. ANALYSIS OF CASCADED TRANSISTORS 


It is often stated that in a multi-stage amplifier, the output stage
alone determines the over-all distortion. Even though this statement
is true to some extent, it is frequently found in practice that the effects
of the previous stages cannot be ignored and sometimes the previous
stage is dominant. This is especially true if both minimum noise figure
(which requires lower bias current) and modulation requirements are
to be met by a two stage amplifier. Two analysis tools based on Volterra
series are presented here which enable the study of such cascaded stages.

Fig. 14 - Variation of $M_2$, $M_3$ with $r_b$ (fundamental tones 16.6, 15.2, 14.5 MHz; $f_1 + f_2 - f_3$ = 17.3 MHz; $f_2 - f_3$ = 0.7 MHz).

Fig. 15 - Variation of $M_2$, $M_3$ with $C_2$.

Fig. 16 - Variation of $M_2$, $M_3$ with r..

Fig. 17 - Variation of $M_2$, $M_3$ with r..

The first approach makes use of the cascaded formulae mentioned 
earlier; this method illustrates the cascade phenomenon and permits 
derivation of simple cascade rules. 

Consider two cascaded transistors (Fig. 1); let the output voltage
($v_2$) of the first transistor be denoted by $D(v_1)$; the output voltage ($v_3$)
of the second stage by $E(v_2)$ and $F(v_1)$. The aim is to compute the kernels
$F_1(s_1)$, $F_2(s_1, s_2)$, $F_3(s_1, s_2, s_3)$ knowing $D$ and $E$. To calculate $D_1(s_1)$,
etc., it is necessary to know the load impedance of the first stage which
is the input impedance of the second transistor. This can be computed;
thus, for a given generator impedance and bias conditions, $D_1(s_1)$,
$D_2(s_1, s_2)$, $D_3(s_1, s_2, s_3)$ can be determined. $E_1(s_1)$, $E_2(s_1, s_2)$ and
$E_3(s_1, s_2, s_3)$ can be computed for a given load and bias conditions with $R_g = 0$
(voltage $v_2$ is directly impressed across the second transistor). Now the
expression for $v_3$ in terms of $v_1$ is given by


$$v_3 = F(v_1) = E(v_2) = E(D(v_1)) = (E \circ D)(v_1). \tag{30}$$


It is seen that $F$ is related to $E$ and $D$ by the cascade formulae whose
physical significance is discussed below.


6.1 Linear Term 


The linear term is given by 
F,(s) = D,(s)E,(s) (31) 


which states that the overall gain in dB is the gain of the first stage 
in dB plus the gain of the last stage in dB. 


6.2 Second Harmonic Term 


The second-degree kernel is given by

$$F_2(s_1, s_2) = E_1(s_1 + s_2) D_2(s_1, s_2) + E_2(s_1, s_2) \prod_{i=1}^{2} D_1(s_i). \tag{32}$$


The first term of the formula states that a given harmonic product
from the first transistor $D_2(j\omega_a, \pm j\omega_b)$ is amplified by the second
transistor at the harmonic frequency $E_1(j\omega_a \pm j\omega_b)$. The second term shows
that the two fundamental tones are amplified by the first transistor
$[D_1(j\omega_a) D_1(\pm j\omega_b)]$ and then the second transistor acts on these tones
to produce distortion $E_2(j\omega_a, \pm j\omega_b)$.

Equation (32) is related to the second harmonic distortion ($M_2$)
as follows:


/10°R, 
2 


P,(s, 5) So) 
F’,(s,)F',(S2) 
V10°R;, D.(81_, 82)} Ey(s: + 82) E4(s; , Se) 
2 Ti (a \F ve.) can 

: II Dus.) Bi(s:)4 (2) TJ £.ls,) 








M, = 20 log (33) 


= 20 log - (34) 


The second term is the second harmonic distortion of the last stage.
The first term expresses the contribution from the first stage; it
approximately equals


$$[\text{first stage second harmonic distortion in dBm}] - [\text{gain of the last stage in dB}]. \tag{35}$$


This shows that if the gain of the last stage is high, the contribution
from the first stage is small. Equation (35) is approximate in two
respects; it neglects the frequency effects and the phase addition of the
contributions from the first and the second stage. In (35), the second
stage gain in question is actually the ratio


$$\frac{E_1(s_1 + s_2)}{E_1(s_1) E_1(s_2)},$$


which involves the two fundamental and the harmonic frequencies.
As an example, a shaping network which was introduced increased the
gain (18 dB) at the harmonic frequency (0.7 MHz) and decreased the
gain at fundamental tones 15.2 MHz (8 dB) and 14.5 MHz (8 dB)
with the result that the overall distortion was poorer by 34 dB.
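
The 34 dB figure follows directly from the ratio above: the first-stage contribution scales with the gain at the harmonic frequency divided by the gains at the two fundamentals. Writing $\Delta G$ for the change in second-stage gain in dB at each frequency (a symbol introduced here for this check), the change in that contribution is

$$\Delta = \Delta G(0.7\ \text{MHz}) - \Delta G(15.2\ \text{MHz}) - \Delta G(14.5\ \text{MHz}) = 18 - (-8) - (-8) = 34\ \text{dB}.$$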


6.3 Third Harmonic Distortion Term 
The third harmonic kernel $F_3(s_1, s_2, s_3)$ is given by

$$F_3(s_1, s_2, s_3) = E_1(s_1 + s_2 + s_3) D_3(s_1, s_2, s_3) + 2E_2(s_1, s_2 + s_3) D_1(s_1) D_2(s_2, s_3) + E_3(s_1, s_2, s_3) \prod_{i=1}^{3} D_1(s_i). \tag{36}$$


The first term shows that the third harmonic product of the first stage
$[D_3(s_1, s_2, s_3)]$ is amplified by the last stage at the harmonic frequency
$[E_1(s_1 + s_2 + s_3)]$. The second term is the interaction term; it arises
when the second-degree kernel of the last stage $[E_2(s_1, s_2 + s_3)]$ acts
on the sum of the fundamental $[D_1(s_1)]$ and the second harmonic output
of the first stage $[D_2(s_2, s_3)]$. The last term shows that the second stage
third-degree kernel $[E_3(s_1, s_2, s_3)]$ acts on the fundamental tones
amplified at the respective frequencies by the first stage $[D_1(s_1) D_1(s_2) D_1(s_3)]$.
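
The cascade relations (31), (32) and (36) can be collected into a small routine. In the sketch below each stage is described by three kernel callables; the interaction term of the third-degree kernel is averaged over the three frequency assignments so that the result is symmetric. The function name and the memoryless test stages are illustrative assumptions.

```python
def cascade_kernels(d, e):
    """Return (F1, F2, F3) for stage D followed by stage E.

    d and e are (first, second, third) tuples of kernel callables taking
    one, two and three complex frequencies respectively.
    """
    d1, d2, d3 = d
    e1, e2, e3 = e

    def f1(s1):
        return d1(s1) * e1(s1)

    def f2(s1, s2):
        return (e1(s1 + s2) * d2(s1, s2)
                + e2(s1, s2) * d1(s1) * d1(s2))

    def f3(s1, s2, s3):
        # Interaction term 2 E2 D1 D2, averaged over the three orderings.
        interaction = (2.0 / 3.0) * (
            e2(s1, s2 + s3) * d1(s1) * d2(s2, s3)
            + e2(s2, s1 + s3) * d1(s2) * d2(s1, s3)
            + e2(s3, s1 + s2) * d1(s3) * d2(s1, s2))
        return (e1(s1 + s2 + s3) * d3(s1, s2, s3)
                + interaction
                + e3(s1, s2, s3) * d1(s1) * d1(s2) * d1(s3))

    return f1, f2, f3


# Memoryless (frequency-independent) stages with hypothetical coefficients.
stage_d = (lambda s1: 2.0, lambda s1, s2: 0.3, lambda s1, s2, s3: 0.05)
stage_e = (lambda s1: 3.0, lambda s1, s2: 0.2, lambda s1, s2, s3: 0.01)

f1, f2, f3 = cascade_kernels(stage_d, stage_e)
print(f1(1j), f2(1j, 2j), f3(1j, 2j, -3j))
```

For memoryless stages the kernels are constants, and the formulas reduce to the coefficients of the composed polynomial $E(D(x))$, which provides a convenient check.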

From (36), the overall third harmonic distortion is related to that
of the individual transistors by 














M, = 20 log + 107%R, | |S + ef 80)| | PalGs s 82 » Be) 
i: 4 3 3 
I] E,(s;) I] D,(s,) 
41.9 Eo(s1 , 82 “+ Ss) 1 D,(82_, 83) ae E3(8y 89-5583) ; (37) 
2 E, 2 3 
Tee) | II Deo IL Bus.) 
The first term is the contribution from the third harmonic term of
the first stage; it is given approximately by


$$[\text{third harmonic distortion of the first stage in dBm}] - 2\,[\text{gain of the last stage in dB}]. \tag{38}$$


The interaction term approximately equals


$$[\text{second harmonic distortion of the first stage in dBm}] + [\text{second harmonic distortion of the second stage in dBm}] + 6\ \text{dB} - [\text{gain of the last stage in dB}]. \tag{39}$$

The third term in (37) is the third harmonic distortion of the last stage
in dBm.

It is seen that the effect of the first stage and the interaction term
can be reduced by increasing the gain of the last stage. Equation (39)
illustrates that the second harmonic distortion of each stage should be
good. This may become a limitation if the first stage is biased at lower
currents.
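
The rules of thumb (35), (38) and (39) amount to simple dB arithmetic, sketched below. The stage distortion figures and gain are hypothetical; as noted in the text, these estimates ignore frequency dependence and the phase with which the contributions add, so the three results are reported separately rather than combined.

```python
def first_stage_second_harmonic(m2_first_dbm, last_gain_db):
    """Rule (35): first-stage M2 contribution referred to the output."""
    return m2_first_dbm - last_gain_db


def first_stage_third_harmonic(m3_first_dbm, last_gain_db):
    """Rule (38): first-stage M3 contribution referred to the output."""
    return m3_first_dbm - 2.0 * last_gain_db


def interaction_term(m2_first_dbm, m2_second_dbm, last_gain_db):
    """Rule (39): third-degree product formed from the two stages' M2."""
    return m2_first_dbm + m2_second_dbm + 6.0 - last_gain_db


# Hypothetical stage figures for illustration only.
M2_FIRST, M2_SECOND, M3_FIRST, LAST_GAIN = -60.0, -70.0, -90.0, 20.0

print(first_stage_second_harmonic(M2_FIRST, LAST_GAIN),
      first_stage_third_harmonic(M3_FIRST, LAST_GAIN),
      interaction_term(M2_FIRST, M2_SECOND, LAST_GAIN))
```

Raising the last-stage gain lowers all three estimates, while the interaction estimate also depends on the second harmonic distortion of both stages.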

In the above simplified expressions [(38) and (39)] phase addition and
frequency effects have not been considered. In (38), 2 (gain in dB)
actually represents


$$\frac{E_1(j\omega_a + j\omega_b - j\omega_c)}{E_1(j\omega_a)\, E_1(j\omega_b)\, E_1(-j\omega_c)}.$$


In (39) the second harmonic distortion is to be measured with two 
tones, one at the fundamental and the other at the harmonic frequency 


Ei(S) , 8 + S3) 
Ei,(8,)E,(S2 + 83) 


and then multiplied by the ratio of the gain 


Ei (82 + 83) ; 
E,(8;) 


Moreover, the kernel must be made symmetrical by taking the average of 
three possible combinations. 

Thus, the simplified expressions (35), (38), and (39) are exact if the
transistor performance is not frequency dependent; in general, they
can be used to get a qualitative picture. Equations (34) and (37) are
exact and take into account the frequency dependence. The
computer program is being extended to calculate (34) and (37).
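
The statement that (38) and (39) are exact for frequency-independent stages can be checked numerically. The sketch below is hedged: it assumes that the single-stage distortion figures are harmonic levels in dBm referred to a 0 dBm fundamental across the load, so that the prefactor of (37) applies to third-order terms and its square root to second-order terms. The helper names and coefficient values are hypothetical, and the text's "6 dB" is taken as 20 log 2.

```python
import math

R_L = 75.0                # load in ohms, as in the 75-ohm example of the text
P = 0.5e-3 * R_L          # prefactor of (37); sqrt(P) assumed for second order


def db(x):
    """20 log10 |x|."""
    return 20.0 * math.log10(abs(x))


def m2(c1, c2):
    """Assumed second harmonic level (dBm) of one stage at 0 dBm fundamental."""
    return db(math.sqrt(P) * c2 / c1 ** 2)


def m3(c1, c3):
    """Assumed third harmonic level (dBm) of one stage at 0 dBm fundamental."""
    return db(P * c3 / c1 ** 3)


d1, d2, d3 = 3.0, 0.2, 0.01      # hypothetical first-stage coefficients
e1, e2, e3 = 4.0, 0.1, 0.005     # hypothetical last-stage coefficients
gain_last = db(e1)

approx_terms = (
    m3(d1, d3) - 2.0 * gain_last,                          # (38)
    m2(d1, d2) + m2(e1, e2) + db(2.0) - gain_last,         # (39)
    m3(e1, e3),                                            # last stage alone
)

# The same three terms taken directly from (37) with constant kernels.
exact_terms = (
    db(P * e1 * d3 / (e1 ** 3 * d1 ** 3)),
    db(P * 2.0 * e2 * d2 / (e1 ** 3 * d1 ** 2)),
    db(P * e3 / e1 ** 3),
)
print(approx_terms)
print(exact_terms)
```

With frequency-dependent kernels the two tuples would differ, which is why (37) rather than the dB bookkeeping is needed in general.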

An alternate approach to calculate the distortion of cascaded stages 
is to analyze the nonlinear equivalent circuit of cascaded transistors 
using the nodal technique illustrated in Section IV. The nodal equations 
are derived first; next each nodal voltage is expressed in terms of the 
Volterra series of the input voltage; the resulting vector matrix equations 
are successively solved. Since the procedure is similar, the details are 
omitted. 

Two cascaded common-collector stages were analyzed using this approach
(Fig. 18). The measured values at 120 mA, 10 V with 75-ohm source and
load impedances were −87 dBm and −112 dBm for the second and
the third harmonic distortion, respectively. The computed distortion
values are −88.7 dBm for second and −116.6 dBm for third harmonic
distortion. Thus, good agreement with the experimental results is obtained.

The cascade formulas are simple, physically meaningful and yield 
rules of thumb to judge the effect of the first stages. The nodal approach 
is more complicated. However, the advantage of the nodal approach 
is that it is general and can be used for an amplifier. For example, a 
cascade of common-emitter and common-collector stages involves five 
nodes; if shunt feedback is used at the input and at the output, the 
same program can be used to analyze this amplifier. (Cascade formulas 
do not take feedback into account.) In general, the nodal approach can
be extended to study frequency-dependent nonlinear networks with
n nodes, if the nonlinearities are small.


Fig. 18 - Common-collector/common-collector nonlinear equivalent circuit.
VII. ENGINEERING APPLICATIONS


A few pertinent practical applications of the work are described 
below. These results were either first predicted by the model and then 
verified in the laboratory or first experimentally observed and then 
confirmed by analysis. 

In the initial design of the L4 repeater, a common-emitter, common-emitter,
common-collector configuration was used in the power amplifier.
The third harmonic modulation performance was not as good as
desired. This led first to the study of the output common-collector stage.
As shown in Fig. 19, an increase in source impedance increases the
distortion of the common-collector stage. Since the output impedance of the
preceding common-emitter stage is high, the common-collector
performance was not optimum. Secondly, the preceding common-emitter
stage was studied because the gain of the common-collector stage is low
(see Section VI). As shown in Fig. 20, an increase in load impedance beyond
the optimum $R_L$ degrades its performance radically. Since the
common-collector input impedance is high, the common-emitter stage
performance was not optimum either. Thus, in the redesign work by Ken
Tantarelli, the common-collector output stage is not being used.

Another interesting feature was the improvement in
modulation performance of the common-emitter stage with increase
in voltage. As shown in Fig. 8, the improvement is a function of load
impedance, and the maximum improvement was obtained at about 150 Ω.

New coaxial systems are currently being studied to operate at higher
frequencies. Different configurations have been examined for the output
stage. The model showed that common-collector and common-base
performance is poorer with an increase in frequency and thus the use
of these stages as output stages was questioned (unless transistors with
higher $f_T$'s are available). Recently, when a new high-frequency modulation
test set was built, experiments confirmed the prediction. The third
harmonic coefficient ($M_3$) for an a + b − c product was 8 dB poorer at
36.5 MHz (due to signals at 36.5 MHz, 40.1 MHz, and 43.1 MHz)
compared to the value at 17.3 MHz (due to signals at 14.5, 15.2, and
16.6 MHz). The common-emitter configuration modulation performance
suffered only about one dB degradation.

Fig. 19 - Common-collector; $M_3$ variation with source impedance $R_g$ (in ohms); $I_C$ = 120 mA; $V_{CE}$ = 10 V; $R_L$ = 75 Ω.

Fig. 20 - Common-emitter; $M_3$ variation with load impedance; $I_C$ = 120 mA; $V_{CE}$ = 10 V; $R_g$ = 75 Ω.


VIII. CONCLUSION AND ACKNOWLEDGMENT 


This paper has presented a useful analysis tool for investigating the
frequency-dependent nonlinear behavior of transistors. A digital computer
program for all three configurations has been prepared. The results
obtained compare favorably with experimental results. The closed-form 
expressions yield a qualitative picture of distortion. The Volterra series 
proves useful in examining cascaded transistors; a few rules of thumb 
are derived and a general nodal analysis which can be extended to 
cascaded stages with feedback is developed. The practical applications 
cited show that the technique can be useful in the computer-aided 
optimal design of linear transistor feedback amplifiers. 

The author gratefully acknowledges the cooperation he received from
Lee C. Thomas, who is responsible for much of the initial development
of the model; he has been particularly concerned about the cancellation 
mechanism which he observed in the analog computer simulation. The 
author wishes to thank Mr. Jack Huang, who first suggested the model, 
Miss J. A. Nicosia, who wrote the program, Mr. F. Kelcourse for many 
useful discussions, and Mr. R. E. Maurer for reading the draft. The 
author would also like to thank Dr. F. H. Blecher for his encouragement 
and continued interest and guidance. 
APPENDIX A


A.1 Higher-Dimensional Transforms 


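The second-degree case is illustrated as an example. From (4),

$$y_2(t) = \int_0^t \int_0^t c_2(t - \tau_1, t - \tau_2) \prod_{i=1}^{2} x(\tau_i)\, d\tau_i. \tag{40}$$

If the system is physically realizable, $c_2(t - \tau_1, t - \tau_2) = 0$ for $\tau_i > t$.
Hence, the limits can be extended to $\infty$:

$$y_2(t) = \int_0^\infty \int_0^\infty c_2(t - \tau_1, t - \tau_2) \prod_{i=1}^{2} x(\tau_i)\, d\tau_i. \tag{41}$$

Introducing dummy variables $t_1$ and $t_2$, the two-dimensional transform is taken:

$$Y_2(s_1, s_2) = \int_0^\infty \int_0^\infty y_2(t_1, t_2) \exp(-s_1 t_1) \exp(-s_2 t_2)\, dt_1\, dt_2$$

$$= \int_0^\infty dt_1 \int_0^\infty dt_2 \int_0^\infty d\tau_1 \int_0^\infty d\tau_2\; c_2(t_1 - \tau_1, t_2 - \tau_2) \prod_{i=1}^{2} x(\tau_i)\, \exp(-s_1 t_1) \exp(-s_2 t_2). \tag{42}$$

Substituting $t_1 - \tau_1 = m_1$, $t_2 - \tau_2 = m_2$, and using the fact that
$c_2(m_1, m_2) = 0$ for $m_i < 0$ yields

$$Y_2(s_1, s_2) = \int_0^\infty dm_1 \int_0^\infty dm_2 \int_0^\infty d\tau_1 \int_0^\infty d\tau_2\; c_2(m_1, m_2)\, x(\tau_1)\, x(\tau_2)\, \exp(-s_1 m_1) \exp(-s_1 \tau_1) \exp(-s_2 m_2) \exp(-s_2 \tau_2) \tag{43}$$

$$= C_2(s_1, s_2)\, X(s_1)\, X(s_2). \tag{44}$$
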
A.2 The Output of the Kernels to Sinusoidal Inputs 


For the second-degree case, consider two sinusoidal signals at
frequencies $f_a$ and $f_b$. The input $x(\tau)$ equals

$$x(\tau) = \frac{\exp(j\omega_a \tau) + \exp(-j\omega_a \tau)}{2} + \frac{\exp(j\omega_b \tau) + \exp(-j\omega_b \tau)}{2}. \tag{45}$$

From (41),

$$y_2(t) = \int_0^\infty d\tau_1 \int_0^\infty d\tau_2\; c_2(t - \tau_1, t - \tau_2) \left[ \frac{\exp(j\omega_a \tau_1) + \exp(-j\omega_a \tau_1)}{2} + \frac{\exp(j\omega_b \tau_1) + \exp(-j\omega_b \tau_1)}{2} \right] \left[ \frac{\exp(j\omega_a \tau_2) + \exp(-j\omega_a \tau_2)}{2} + \frac{\exp(j\omega_b \tau_2) + \exp(-j\omega_b \tau_2)}{2} \right]. \tag{46}$$

Considering one cross term only,

$$\int_0^\infty d\tau_1 \int_0^\infty d\tau_2\; c_2(t - \tau_1, t - \tau_2)\, \tfrac{1}{4} \exp(j\omega_a \tau_1) \exp(j\omega_b \tau_2). \tag{47}$$

Substituting $m_1 = t - \tau_1$, $m_2 = t - \tau_2$ and carrying out the integration
yields

$$\tfrac{1}{4}\, C_2(j\omega_a, j\omega_b) \exp[j(\omega_a + \omega_b)t]. \tag{48}$$

This term occurs twice, as does its complex conjugate.
Hence, the output due to the a + b term alone is

$$y_{a+b}(t) = |C_2(j\omega_a, j\omega_b)| \cos[(\omega_a + \omega_b)t + \phi_{ab}]. \tag{49}$$

The 2a term and its conjugate occur only once in (46); hence, it is
6 dB better. The response of the third harmonic kernel to three sinusoidal
inputs is similarly treated.
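
The 6 dB difference between the a + b product and the 2a product can be reproduced with a memoryless square-law element, for which the second-degree kernel is a constant. This is a minimal sketch under that assumption; the constant `c2`, the tone frequencies, and the helper name are illustrative only.

```python
import math


def tone_amplitude(samples, cycles):
    """Amplitude of the cosine component with `cycles` periods in the record."""
    n = len(samples)
    return 2.0 / n * sum(
        y * math.cos(2.0 * math.pi * cycles * k / n)
        for k, y in enumerate(samples)
    )


c2 = 0.3                  # hypothetical constant second-degree kernel
fa, fb, n = 3, 5, 1024    # tone frequencies (cycles per record) and sample count

# Two unit-amplitude tones as in (45), passed through y = c2 * x^2.
x = [math.cos(2 * math.pi * fa * k / n) + math.cos(2 * math.pi * fb * k / n)
     for k in range(n)]
y = [c2 * v * v for v in x]

sum_product = tone_amplitude(y, fa + fb)       # a + b term, compare with (49)
second_harmonic = tone_amplitude(y, 2 * fa)    # 2a term
ratio_db = 20.0 * math.log10(sum_product / second_harmonic)
print(sum_product, second_harmonic, ratio_db)
```

The a + b amplitude equals the kernel magnitude and the 2a amplitude is half of it, so the ratio is 20 log 2, the "6 dB" of the text.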


A.3 Cascade Relations


For the system shown in Fig. 1, the cascade formulas are given below.
The cascade relations can be symbolically written as

$$Z = F[x] = E[y] = E(D[x]) = (E \circ D)[x]. \tag{50}$$

The formulas are

$$F_1(s_1) = E_1(s_1)\, D_1(s_1) \tag{51}$$

$$F_2(s_1, s_2) = E_1(s_1 + s_2)\, D_2(s_1, s_2) + E_2(s_1, s_2) \prod_{i=1}^{2} D_1(s_i) \tag{52}$$

$$F_3(s_1, s_2, s_3) = E_1(s_1 + s_2 + s_3)\, D_3(s_1, s_2, s_3) + 2 E_2(s_1, s_2 + s_3)\, D_1(s_1)\, D_2(s_2, s_3) + E_3(s_1, s_2, s_3) \prod_{i=1}^{3} D_1(s_i). \tag{53}$$

A physical interpretation of the formulas for cascaded transistors is
given in Section VI. The procedure for deriving the cascade relation is
as follows: the output Z(t) of the last stage is expressed in terms of the
Volterra series of its input (only two terms are considered):

$$Z(t) = \int_0^\infty e_1(t - \tau)\, y(\tau)\, d\tau + \int_0^\infty \int_0^\infty e_2(t - \tau_1, t - \tau_2) \prod_{i=1}^{2} y(\tau_i)\, d\tau_i. \tag{54}$$

The output of the first stage y(t) is related to its input by

$$y(\tau) = \int_0^\infty d_1(\tau - \sigma)\, x(\sigma)\, d\sigma + \int_0^\infty \int_0^\infty d_2(\tau - \sigma_1, \tau - \sigma_2) \prod_{i=1}^{2} x(\sigma_i)\, d\sigma_i. \tag{55}$$

Substituting (55) in (54), terms of the same degree are collected; as an
example, the first second-degree term equals

$$\int_0^\infty e_1(t - \tau)\, d\tau \int_0^\infty \int_0^\infty d_2(\tau - \sigma_1, \tau - \sigma_2) \prod_{i=1}^{2} x(\sigma_i)\, d\sigma_i. \tag{56}$$

Taking the two-dimensional transforms yields

$$E_1(s_1 + s_2)\, D_2(s_1, s_2) \prod_{i=1}^{2} X(s_i). \tag{57}$$

APPENDIX B 
The Nonlinear Parameters 
From (15), 
I, = f2)h(V cz). (58) 


A two-dimensional Taylor’s series expansion of (58) is taken; 7, is 
expressed by K(v.) and vez = vz — 0, . Hence, 


t. = go, V3 — 01) = gilve , v3 — 01) 
+ gee ,U3 — 01) + gave , Us — 01), (59) 
where 
gi(¥2 , V3 — 1) = MK yw. + M,(v; — 2), (60) 
go(Vo , V3 — 04) = a1, K2(v.)? + m(v3 — 04) 
+ a,M,Ko(v2)? + a,11,K,(v,)(v3 — 0), (61) 
and 
ga(V2 , U3 — 0;) = a3Mo(K,)*(v5)> + m3(v3 — 0,)° 
+ a,M.Kwo(v3 — 0)? + ay MT, K 702 (v4 — 1;) 
+ a,MK3(v2)° + 200M K,Ko(v2)> + a,M,K(v.) (v3 — v;). (62) 
The avalanche coefficients are given below.


“ 1 
a (63) 
Ver " 
= (Fe) 
M, = (M1) = oo (M,)° (64) 
M, = 3(I,)’ = 4 — 1) ih 4 Oe. (65) 











seat andl 2B) + BY] 


The coefficients $m_1$, $m_2$, and $m_3$ equal $m_i = I_C (M_i / M_0)$; $i = 1, 2, 3$,
where $I_C$ is the collector dc bias current.
The $h_{FE}$ coefficients are given below:


ay = ae = a (67) 
herr mx + 1+ log’ Tr... a 2a log e log 7 








C max 


_ ok (a)° | i 
i) ey eae 2a log e| log 7 ce log e (68) 


ay 2a2 (a) _ I (a i: 
x | Pas Gy Gi ig ee } (69) 


The collector capacitance coefficients are given by 

















a3 > 


1 = k(Ver)? (70) 

v2 = k(Vosy® (71) 
k +5 

¥; = 27 (Ven) . (72) 


From (62) for $g_3(v_2, v_3 - v_1)$, $g_3$ is obtained by substituting $B_1(s_i)$ for
$v_2$ and $C_1(s_i) - A_1(s_i)$ for $(v_3 - v_1)$; moreover, the kernel must also be
symmetrical. Since the procedure is the same as for $g_2$, it is omitted.
The interaction terms are given below: 


Jes = 202M K; B,(s1)B2(82 , 83) 
+ 2mez [Ci(s1) — Ax(S:)][Co(s2 , 83) — Ae(Se , 83)] 
+ a, MK, B2(s; , 82)[C1(83) — A,(ss)] 
+ a,M,K, By(s:)[Co(sz , 83) — Aa(S2 , 83) (73)
Son = Xe (Cis) — Bis ][Ca( , 83) — Bale , 8s) | (74) 


where the overbar denotes the symmetrized kernel.


APPENDIX C 


A2486 is an n-p-n silicon transistor with overlay type of construction.
Its $f_T$ ranges from 800 to 1000 MHz. It is a power transistor with a current
capability of 1 amp and can handle 2.2 watts of power.

Typical parameter values for transistor type 2486 27 at 120 mA, 
10V are given below: 


I, = 0.12 amps. 

Tp = 13.6 ohms 

'. = 5200 ohms 

C, = (6)10°”’ farads 
Cy = (3.97)10~° farads 
C; = (9.2)10~”” farads 
Z, = 50 ohms 

Zy = 50 ohms 

Ves = 10 volts 

Veso = 350 volts 

nN = 2 

rey = 0.2165 ohms

a = 0.38 

$h_{FE\,max}$ = 122

$I_{C\,max}$ = 0.633 amps.
Coding for Numerical Data Transmission*


By M. M. BUCHNER, JR. 
(Manuscript received June 2, 1966) 


This paper considers the effectiveness of error-correcting codes for the 
transmission of numerical data. In such a situation, errors in the nu- 
merically most significant positions of a message are of greater con- 
sequence than are errors in the less significant positions. A measure 
of transmission fidelity based upon the average magnitude by which the 
numbers delivered to the destination differ from the transmitted numbers 
is developed and is referred to as the average numerical error (ANE). Codes
are compared by comparing the ANE that results from their use. 

Significant-bit codes are defined and the ANE resulting from their use 
is determined. For constant-symbol-rate transmission, the relative
effectiveness of various coding schemes is analyzed when the error probability
in the channel is small. The ANE resulting from the use of certain specific
codes is numerically evaluated and compared. 


I. INTRODUCTION 


The usual approach to coding is to ignore the actual meaning of the 
transmitted symbols and to represent them in a purely statistical 
manner. As a result, all message errors are assumed to be equally 
costly and codes have been sought that simply reduce the probability 
that a message is received in error. 

While this may be appropriate for the transmission of some types
of data, there are situations in which other criteria of goodness are 
of greater merit. If, for example, one is interested in the transmission 
of the temperature of a satellite, the probability that a particular 
observation is transmitted incorrectly may have little direct relation 
to system performance whereas a measure of the average magnitude 
by which the received data differ from the data actually transmitted 
could prove useful. 

* The material presented in this paper is based upon the dissertation, Coding
for Numerical Data Transmission, submitted by the author to The Johns Hopkins
University in conformity with the requirements for the degree Doctor of
Philosophy.
This paper develops a criterion of transmission fidelity for numeri-
cal data transmitted over a binary symmetric channel based upon the 
average numerical error which occurs. Significant-bit codes are de- 
fined and the average numerical error resulting from their use is de- 
termined for a binary symmetric channel with independent errors. For 
constant-symbol-rate transmission, the relative effectiveness of various 
coding schemes is analyzed when the probability that a symbol is 
received in error is small. In order to obtain a feeling for the utility 
of coding, the average numerical error resulting from certain specific 
codes is numerically evaluated. 


II. PRELIMINARIES 


Throughout this paper, the channel is taken to include all operations 
performed upon the symbols during transmission. A binary symmetric 
channel is defined to be a binary channel such that 


(i) the channel always gives one of the binary symbols at its output,
(ii) the probability that any particular sequence of errors occurs is
independent of the symbols transmitted.


In some sections, we shall consider a binary symmetric channel with
independent errors. This is a binary symmetric channel for which
the errors occur independently with probability $p$ where $0 \le p \le \tfrac{1}{2}$
and $p = 1 - q$.

The elements of the Galois field of two elements are denoted by 0
and 1. Let the symbol $\oplus$ denote component-by-component modulo 2
addition of vectors (or n-tuples) whose components are field elements.
The set of all such vectors forms a vector space I of dimension n over
the field of two elements. Because a field element can be viewed as a
vector with one component, $\oplus$ will also be used to denote the addition
of field elements.

A binary group code V is a subset of I which forms a group. Over 
the field of two elements, any set of n-tuples that forms a group is 
indeed a vector space. Therefore, a binary group code V forms a sub- 
space of I. The dimension of V is k. 

The implementation of a binary group code can be viewed in the
following manner. The encoder receives k binary information symbols
(called a message) from the source and determines from the message
(n − k) binary parity check symbols (called an ending). The message
and ending may be interleaved or transmitted sequentially to form a
block of length n (called a code vector). The decoder operates upon
the blocks of n binary symbols coming from the channel in an attempt
to correct transmission errors and provides k binary symbols at its
output. The notation (n, k) is used to denote such a code.

Consider the message $(m_k, m_{k-1}, \ldots, m_1)$. The code vector used
to transmit this message will have $m_k, m_{k-1}, \ldots, m_1$ in the k information
positions. The (n − k) parity check positions that form the ending
are denoted by $e_1, e_2, \ldots, e_{n-k}$. The order in which the information
positions and the parity check positions are arranged for transmission
is arbitrary.

Let H denote the parity check matrix for a binary group code. H
is an (n − k) × n matrix whose entries are field elements. An n-tuple
v is a code vector if and only if

$$v H^T = 0, \tag{1}$$

where $H^T$ denotes the transpose of H. H can be written in a form such
that each column of H that corresponds to a parity check position in a
code vector is a distinct weight* one (n − k)-tuple. When this is done,
let $C_l$ ($1 \le l \le k$) denote the column in H that is in the position that
corresponds to position $m_l$ in a code vector.

For a binary symmetric channel, the order in which symbols are
transmitted can affect code performance. For the binary symmetric
channel with independent errors, the order in which symbols are
transmitted does not affect performance. In the latter case, we can write
H as

$$H = (C_k, C_{k-1}, \ldots, C_1, I_{n-k}), \tag{2}$$

where $I_{n-k}$ denotes the (n − k) × (n − k) identity matrix.
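
The form (2) makes encoding a matter of summing the columns selected by the message bits. The sketch below assumes a hypothetical (7, 4) code whose columns are chosen only for illustration; the function names are likewise not from the paper.

```python
from itertools import product

# Columns C_4, C_3, C_2, C_1 of a hypothetical (7, 4) code; each is an
# (n - k)-tuple. These particular columns are an illustrative assumption.
COLUMNS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
R = 3  # number of parity check positions, n - k


def parity_check_matrix(columns, r):
    """Rows of H = (C_k, ..., C_1, I_{n-k}) as in (2)."""
    identity = [tuple(int(a == b) for b in range(r)) for a in range(r)]
    all_columns = list(columns) + identity
    return [[col[row] for col in all_columns] for row in range(r)]


def encode(message, columns, r):
    """Append the ending that makes v H^T = 0 over the field of two elements."""
    ending = [0] * r
    for bit, col in zip(message, columns):
        if bit:
            ending = [e ^ c for e, c in zip(ending, col)]
    return list(message) + ending


def syndrome(v, h):
    """Compute v H^T modulo 2."""
    return [sum(a * b for a, b in zip(v, row)) % 2 for row in h]


H = parity_check_matrix(COLUMNS, R)
codewords = [encode(m, COLUMNS, R) for m in product((0, 1), repeat=4)]
print(all(syndrome(v, H) == [0, 0, 0] for v in codewords))
```

Every encoded vector satisfies (1), and the message occupies the first k positions, matching the sequential arrangement mentioned above.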


III. FORMULATION OF A CRITERION OF CODING EFFECTIVENESS 


A system for transmitting observations performed upon some physical 
process over a binary channel is shown in Fig. 1. So that the relation- 
ship between the observed numbers and the code will be clear, a general 
formulation will be presented. 

If each quantization step is of uniform size, the quantizer output
can be represented as A + Bi where A and B are constants and the
integer i indicates the quantization level. The "source scale-to-binary
converter" receives A + Bi from the quantizer and transmits i to
the encoder. The "binary-to-source scale converter" receives some
integer j from the decoder and delivers A + Bj to the destination.


* The weight of a vector v is the number of nonzero components in v and is denoted
by w[v]. The distance between two vectors u and v is $w[u \oplus v]$.
Fig. 1 - System model.


Let Pr{j | i} be the probability of receiving j at the decoder output
when i served as the encoder input and let Pr{i} be the probability
that i is sent. The average numerical error (ANE) that occurs is

$$\mathrm{ANE} = \sum_{i} \sum_{j} |(A + Bj) - (A + Bi)| \Pr\{j \mid i\} \Pr\{i\}. \tag{3}$$

If all values of i are equally likely to be observed and if the range
for i is $0 \le i \le 2^k - 1$, Pr{i} = $2^{-k}$. The range for j is thus
$0 \le j \le 2^k - 1$ and (3) becomes

$$\mathrm{ANE} = \frac{B}{2^k} \sum_{i=0}^{2^k - 1} \sum_{j=0}^{2^k - 1} |j - i| \Pr\{j \mid i\}.$$

Because B is a constant not dependent upon the particular coding
scheme implemented, B may be set equal to 1 when comparing the
effectiveness of different codes. Accordingly, we shall consider the
expression

$$\mathrm{ANE} = \frac{1}{2^k} \sum_{i=0}^{2^k - 1} \sum_{j=0}^{2^k - 1} |j - i| \Pr\{j \mid i\}. \tag{4}$$

For a specified value of k, a given coding scheme is considered preferable
to some other coding scheme if the ANE resulting from the implementation
of the given code is less than the ANE resulting from the alternative
code.

The code enters (4) through the terms Pr{j | i}. Thus, for a binary
symmetric channel, the ANE will, in general, be dependent not only
upon the error statistics of the channel but also upon the order in which
the symbols are transmitted.
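
Expression (4) can be evaluated directly for small k. The following sketch does so for uncoded transmission over a binary symmetric channel with independent errors; the function names are hypothetical and the uncoded channel model is chosen only because its transition probabilities are simple to write down.

```python
def ane_brute_force(k, transition):
    """ANE from (4): (1 / 2^k) * sum over i and j of |j - i| * Pr{j | i}."""
    size = 1 << k
    total = 0.0
    for i in range(size):
        for j in range(size):
            total += abs(j - i) * transition(j, i)
    return total / size


def uncoded_bsc(k, p):
    """Pr{j | i} for uncoded k-bit words with independent symbol errors."""
    q = 1.0 - p

    def transition(j, i):
        flips = bin(i ^ j).count("1")      # number of symbols in error
        return p ** flips * q ** (k - flips)

    return transition


p = 0.01
print(ane_brute_force(1, uncoded_bsc(1, p)))   # one bit: the ANE equals p
print(ane_brute_force(3, uncoded_bsc(3, p)))
```

The double sum has 4^k terms, which is what motivates the reduction to terms conditional on zero being sent.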
It is possible to simplify (4) to an expression that involves terms of
the form Pr{i | 0} exclusively. This reduces the number of terms by a
factor of $2^k$ and demonstrates that knowledge of the error probabilities
conditional upon zero being sent is sufficient to evaluate the ANE.
However, it is necessary to develop some notation and to present two
lemmas before proceeding to simplify (4). The proofs of the lemmas are
omitted because the lemmas follow from the group property of the
code.

When the integer i is to be sent, let us assume that the message
utilized is the k-bit binary representation of i (which is denoted by
B(i)) such that

$$B(i) = (m_k, m_{k-1}, \ldots, m_1),$$

where

$$i = m_k 2^{k-1} + m_{k-1} 2^{k-2} + \cdots + m_1.$$

The ending $E_i = (e_1, e_2, \ldots, e_{n-k})$ required to encode B(i) is chosen
so that the resulting code vector C(i) satisfies (1).


Lemma 1: For any values of the integers i and j, $0 \le i \le 2^k - 1$ and
$0 \le j \le 2^k - 1$, there exists an integer l such that Pr{j | i} = Pr{l | 0}
where $B(l) = B(i) \oplus B(j)$ and $0 \le l \le 2^k - 1$.


Lemma 2: Let $B(l) = B(i) \oplus B(j)$ as in Lemma 1. For fixed i ($0 \le i \le 2^k - 1$),
as j successively takes on the values $0, 1, 2, \ldots, 2^k - 1$, l
takes on each of the values in the range $0 \le l \le 2^k - 1$ once and only once.


Theorem 1: Let all messages be equally likely to be transmitted and let
the channel be binary symmetric (but not necessarily with independent
errors). For these conditions, the average numerical error is

$$\mathrm{ANE} = \sum_{j=1}^{k} 2^{j-1} \sum_{i=2^{j-1}}^{2^j - 1} \Pr\{i \mid 0\}. \tag{5}$$


Proof: By Lemmas 1 and 2, for each value of i and for a specified
value of l, there will be a unique integer $j_l$ such that Pr{$j_l$ | i} = Pr{l | 0}
where $B(l) = B(i) \oplus B(j_l)$. From (4),

$$\mathrm{ANE} = \frac{1}{2^k} \sum_{l=1}^{2^k - 1} \sum_{i=0}^{2^k - 1} |j_l - i| \Pr\{l \mid 0\}, \tag{6}$$

where we have used the fact that $|j_l - i| = 0$ when l = 0.

For each value of l ($1 \le l \le 2^k - 1$), we wish to determine
$\sum_{i=0}^{2^k - 1} |j_l - i|$. Let a ($0 \le a \le k - 1$) be the largest integer such that
$2^a \le l$. Define $i'$ and $j_l'$ as

$$B(i') = B(j_l) \oplus B(2^a) \tag{7a}$$

$$B(j_l') = B(i) \oplus B(2^a) \quad \text{or} \quad B(i) = B(j_l') \oplus B(2^a). \tag{7b}$$

Then

$$B(i') \oplus B(j_l') = B(i) \oplus B(j_l) = B(l). \tag{7c}$$

Because l > 0, $i \ne j_l$ and $i' \ne j_l'$. Suppose $i > j_l$. Then $j_l = i' - 2^a$
and $j_l' = i - 2^a$ by (7). It follows that $j_l - i = -2^a - i + i'$ and
$j_l' - i' = -2^a + i - i'$. Conversely, if $i < j_l$, $j_l = i' + 2^a$ and
$j_l' = i + 2^a$. Thus, $j_l - i = 2^a - i + i'$ and $j_l' - i' = 2^a + i - i'$.

Therefore,

$$|j_l - i| + |j_l' - i'| = |2^a + i - i'| + |2^a - i + i'|. \tag{8}$$

But $B(i) = B(2^a) \oplus B(l) \oplus B(i')$ by (7). Thus, $|i - i'| < 2^a$ and,
from (8),

$$|j_l - i| + |j_l' - i'| = 2 \cdot 2^a.$$

Because of the symmetries involved,

$$2 \sum_{i=0}^{2^k - 1} |j_l - i| = \sum_{i=0}^{2^k - 1} \left( |j_l - i| + |j_l' - i'| \right) = 2^k \cdot 2 \cdot 2^a.$$

Thus, (6) becomes

$$\mathrm{ANE} = \sum_{l=1}^{2^k - 1} 2^a \Pr\{l \mid 0\}$$

or

$$\mathrm{ANE} = \sum_{a=0}^{k-1} 2^a \sum_{l=2^a}^{2^{a+1} - 1} \Pr\{l \mid 0\}. \quad \text{QED}$$
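
The key step of the proof is that the sum of |j_l − i| over all i equals $2^k \cdot 2^a$, where B(j_l) = B(i) ⊕ B(l) and $2^a$ is the largest power of 2 not exceeding l. A short exhaustive check is sketched below; the helper names are illustrative.

```python
def xor_distance_sum(k, l):
    """Sum over all k-bit integers i of |j_l - i|, with B(j_l) = B(i) XOR B(l)."""
    return sum(abs((i ^ l) - i) for i in range(1 << k))


def leading_power(l):
    """2**a, where a is the largest integer such that 2**a <= l."""
    return 1 << (l.bit_length() - 1)


k = 5
identity_holds = all(
    xor_distance_sum(k, l) == (1 << k) * leading_power(l)
    for l in range(1, 1 << k)
)
print(identity_holds)
```

Because the sum depends on l only through a, all Pr{l | 0} sharing the same leading power of 2 receive the same weight in the final expression.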


In (5), notice that Pr{0 | 0} does not appear and that the terms Pr{i | 0}
are not weighted linearly in i but that the weighting coefficients go
in steps as powers of 2 with several conditional probabilities having
the same weighting coefficient. Notice that the weighting coefficient
for Pr{i | 0} is $2^{j-1}$ where $2^{j-1}$ is the largest power of 2 not exceeding i. All errors
with the same coefficient are of the same seriousness and a good code
must reduce these sets of probabilities rather than simply minimize the
probability that a few very large errors occur.
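
The level weighting in (5) is easy to evaluate once Pr{i | 0} is known. The sketch below, with hypothetical function names, applies (5) to uncoded transmission over a binary symmetric channel with independent errors and compares it with the closed form $\sum_{j=1}^{k} 2^{j-1} p q^{k-j}$ for uncoded k-tuples.

```python
def ane_theorem1(k, prob_given_zero):
    """ANE from (5): level j contributes 2^(j-1) times its summed probabilities."""
    total = 0.0
    for j in range(1, k + 1):
        level = range(1 << (j - 1), 1 << j)     # 2^(j-1) <= i <= 2^j - 1
        total += (1 << (j - 1)) * sum(prob_given_zero(i) for i in level)
    return total


def uncoded_prob_given_zero(k, p):
    """Pr{i | 0} for uncoded k-bit words with independent symbol errors."""
    q = 1.0 - p
    return lambda i: p ** bin(i).count("1") * q ** (k - bin(i).count("1"))


k, p = 4, 0.02
q = 1.0 - p
ane = ane_theorem1(k, uncoded_prob_given_zero(k, p))
closed_form = sum(2 ** (j - 1) * p * q ** (k - j) for j in range(1, k + 1))
print(ane, closed_form)
```

Only $2^k - 1$ probabilities are needed here, compared with the $4^k$ terms of the double sum in (4).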

Because the set of messages B(i) ($2^{j-1} \le i \le 2^j - 1$) gives rise to
the set of conditional probabilities whose weighting coefficient in the
ANE expression is $2^{j-1}$, we shall call these messages the j-level messages
and the corresponding conditional probabilities, Pr{$2^{j-1}$ | 0} through
Pr{$2^j - 1$ | 0}, the j-level conditional probabilities. The 0-level message
is defined to be B(0) and the 0-level conditional probability to be
Pr{0 | 0}.

The j-level messages have the following interesting characteristics.


(i) Component $m_j$ in each message is 1.
(ii) Components $m_l$ ($j + 1 \le l \le k$) in each message are 0.
(iii) Every possible (j − 1)-tuple occurs once and only once as
components $m_1$ through $m_{j-1}$ of some j-level message.


For a perfect error-correcting code used with a binary symmetric 
channel with independent errors, it is possible to compute the j-level 
conditional probabilities and thus the ANE from a knowledge of the 
weight distribution of the code vectors on each level (these weight 
distributions have been referred to as level weight structures.)' The 
problem of efficiently computing the level weight structures from knowl- 
edge of the parity check matrix has been discussed previously.’ 


IV. SIGNIFICANT-BIT CODES 


In order to permit the error-correcting capabilities of a code to
correspond somewhat to the significance of the information positions,
it is possible to formulate a type of code which uses a subcode to protect
the $(k - k_0)$ most significant positions of a message and simply transmits
the remaining symbols unprotected. The name significant-bit code
(SB code) is used for this type of code. An SB code is specified by the
parity check matrix $H_{SB}$, and the ANE resulting from the use of an
SB code is $\mathrm{ANE}_{SB}$.

The code utilized to protect the $(k - k_0)$ most significant information
positions will be named the base code. Because it is confined to
the $(k - k_0)$ most significant positions, we can abstract the base code
and study it as a separate entity. Accordingly, the base code vectors
are $(n - k_0)$-tuples of which the first $(k - k_0)$ positions are the base
messages.

Although the concept of SB codes is applicable to any binary symmetric
channel, we shall assume independent errors in the following
analysis. Thus, from (2), the base code is specified by the base parity
check matrix $H_B$ where

$$H_B = (C'_{k-k_0}, C'_{k-k_0-1}, \ldots, C'_1, I_{n-k}).$$


In this case, the code vector $C(i) = B(i) \mid E_i$ where the symbol $\mid$
indicates that C(i) can be partitioned into the k-tuple B(i) and the
(n − k)-tuple $E_i$. Let B(i) be partitioned so that $B(i) = B'(i') \mid B''(i'')$
where $B'(i')$ denotes the $(k - k_0)$ most significant positions of B(i)
and $B''(i'')$ denotes the $k_0$ least significant positions of B(i). Then

$$C(i) = B'(i') \mid B''(i'') \mid E_i.$$

The range for $i'$ is $0 \le i' \le 2^{k-k_0} - 1$ and for $i''$ is $0 \le i'' \le 2^{k_0} - 1$.
Let $\Pr_B\{i' \mid j'\}$ denote the probability of receiving $i'$ when $j'$ is sent
using the base code. By Theorem 1, the ANE for the base code ($\mathrm{ANE}_B$) is

$$\mathrm{ANE}_B = \sum_{j=1}^{k-k_0} 2^{j-1} \sum_{i'=2^{j-1}}^{2^j - 1} \Pr\nolimits_B\{i' \mid 0\}. \tag{9}$$
Because the base code is used exclusively to protect the $(k - k_0)$
most significant information positions, $H_{SB}$ must have the form

$$H_{SB} = (C'_{k-k_0}, C'_{k-k_0-1}, \ldots, C'_2, C'_1, \underbrace{0, \ldots, 0}_{k_0\ \text{columns}}, I_{n-k}),$$

where 0 is used to represent an all-zero column of $H_{SB}$ and where
the $C'_l$ ($1 \le l \le k - k_0$) are the columns of $H_B$. The coset leaders
in the standard array for the SB code must be obtained from the
coset leaders in the standard array for the base code by expanding the
base coset leaders in length to n-tuples by inserting $k_0$ zeros in information
positions 1 through $k_0$ of the expanded vectors. Because all vectors
in column i of the standard array for the SB code will have $B''(i'')$
in information positions 1 through $k_0$,

$$\Pr\{i \mid 0\} = p^{w[B''(i'')]}\, q^{k_0 - w[B''(i'')]}\, \Pr\nolimits_B\{i' \mid 0\}. \tag{10}$$


We shall now show that $\mathrm{ANE}_{SB}$ can be expressed in terms of the
properties of the base code.


Theorem 2: Let the base code be defined as above. For a binary symmetric
channel with independent errors and when all messages are equally likely
to be transmitted,

$$\mathrm{ANE}_{SB} = \Pr\nolimits_B\{0 \mid 0\} \sum_{j=1}^{k_0} 2^{j-1} p\, q^{k_0 - j} + 2^{k_0}\, \mathrm{ANE}_B. \tag{11}$$


Proof: Define

$$\mathrm{ANE}' = \sum_{j=1}^{k_0} 2^{j-1} \sum_{i=2^{j-1}}^{2^j - 1} \Pr\{i \mid 0\}$$

and

$$\mathrm{ANE}'' = \sum_{j=k_0+1}^{k} 2^{j-1} \sum_{i=2^{j-1}}^{2^j - 1} \Pr\{i \mid 0\}.$$

From Theorem 1, $\mathrm{ANE}_{SB} = \mathrm{ANE}' + \mathrm{ANE}''$.

Let us first analyze $\mathrm{ANE}'$. For $1 \le j \le k_0$, the sum of the j-level
conditional probabilities is

$$\sum_{i=2^{j-1}}^{2^j - 1} \Pr\{i \mid 0\} = \sum_{i''=2^{j-1}}^{2^j - 1} p^{w[B''(i'')]}\, q^{k_0 - w[B''(i'')]}\, \Pr\nolimits_B\{0 \mid 0\},$$

where we have used (10) and realized that $i' = 0$ for all messages on
this level. Because every (j − 1)-tuple occurs as components $m_1$ through
$m_{j-1}$ of some j-level message and $m_j = 1$ in every j-level message,
there are

$$\binom{j-1}{w[B''(i'')] - 1}$$

messages of weight $w[B''(i'')]$ on the j-level. Thus,

$$\sum_{i=2^{j-1}}^{2^j - 1} \Pr\{i \mid 0\} = \Pr\nolimits_B\{0 \mid 0\} \sum_{t=1}^{j} \binom{j-1}{t-1} p^t q^{k_0 - t} = \Pr\nolimits_B\{0 \mid 0\}\, p\, q^{k_0 - j}$$

and

$$\mathrm{ANE}' = \Pr\nolimits_B\{0 \mid 0\} \sum_{j=1}^{k_0} 2^{j-1} p\, q^{k_0 - j}.$$

Now consider $\mathrm{ANE}''$. On level $k_0 + \xi$ ($1 \le \xi \le k - k_0$), i has the
range $2^{k_0+\xi-1} \le i \le 2^{k_0+\xi} - 1$. Divide this range into $2^{\xi-1}$ sets of
consecutive integers each of size $2^{k_0}$. Let the integer $\delta$ index these sets
where $0 \le \delta \le 2^{\xi-1} - 1$. For a particular value of $\delta$, as i increases
from $2^{k_0+\xi-1} + \delta 2^{k_0}$ to $2^{k_0+\xi-1} + (\delta + 1) 2^{k_0} - 1$, $i' = 2^{\xi-1} + \delta$ and $i''$
runs through the range $0 \le i'' \le 2^{k_0} - 1$. Thus, using (10),

$$\sum_{i=2^{k_0+\xi-1} + \delta 2^{k_0}}^{2^{k_0+\xi-1} + (\delta+1) 2^{k_0} - 1} \Pr\{i \mid 0\} = \sum_{i''=0}^{2^{k_0} - 1} p^{w[B''(i'')]}\, q^{k_0 - w[B''(i'')]}\, \Pr\nolimits_B\{2^{\xi-1} + \delta \mid 0\}.$$

As $i''$ runs through the range $0 \le i'' \le 2^{k_0} - 1$, each possible $k_0$-tuple
occurs once and only once. Therefore,

$$\sum_{i=2^{k_0+\xi-1} + \delta 2^{k_0}}^{2^{k_0+\xi-1} + (\delta+1) 2^{k_0} - 1} \Pr\{i \mid 0\} = \Pr\nolimits_B\{2^{\xi-1} + \delta \mid 0\} \sum_{t=0}^{k_0} \binom{k_0}{t} p^t q^{k_0 - t} = \Pr\nolimits_B\{2^{\xi-1} + \delta \mid 0\}. \tag{12}$$

Because of the manner in which the sets were chosen, $\mathrm{ANE}''$ can be
expanded as

$$\mathrm{ANE}'' = \sum_{\xi=1}^{k-k_0} 2^{k_0+\xi-1} \sum_{\delta=0}^{2^{\xi-1} - 1}\ \sum_{i=2^{k_0+\xi-1} + \delta 2^{k_0}}^{2^{k_0+\xi-1} + (\delta+1) 2^{k_0} - 1} \Pr\{i \mid 0\}. \tag{13}$$

Substituting (12) into (13), we obtain

$$\mathrm{ANE}'' = 2^{k_0} \sum_{\xi=1}^{k-k_0} 2^{\xi-1} \sum_{i'=2^{\xi-1}}^{2^{\xi} - 1} \Pr\nolimits_B\{i' \mid 0\},$$

which, from (9), is exactly $2^{k_0}\, \mathrm{ANE}_B$. QED

Notice that the situation $k = k_0$ can be included in this formulation
if we define $\mathrm{ANE}_B = 0$ and $\Pr_B\{0 \mid 0\} = 1$ when $k = k_0$. Thus, uncoded
transmission can be regarded as an SB code in which $k = k_0$.

The interpretation of (11) is interesting. The quantity $\sum_{j=1}^{k_0} 2^{j-1} p\, q^{k_0 - j}$
is the ANE that results from the uncoded transmission of $k_0$-tuples.
Thus, $\mathrm{ANE}_{SB}$ is the ANE for uncoded transmission of $k_0$-tuples weighted
by $\Pr_B\{0 \mid 0\}$ plus $2^{k_0}$ times $\mathrm{ANE}_B$.

Equation (11) enables the computation of $\mathrm{ANE}_{SB}$ from the properties of the
base code. Because the base code involves messages of length $(k - k_0)$,
it is easier to analyze than the entire SB code.
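
Equation (11) can be checked on a very small SB code. The sketch below assumes, purely for illustration, a base code equal to the (3, 1) repetition code protecting one most significant bit, with $k_0 = 2$ unprotected bits; majority voting then plays the role of standard-array decoding with single-error coset leaders. All names are hypothetical.

```python
from itertools import product


def sb_prob_given_zero(p, k0=2):
    """Pr{l | 0} for an SB code whose base code is the (3, 1) repetition code.

    The all-zero block is sent: three copies of the most significant bit
    followed by k0 unprotected bits. The decoder takes a majority vote on
    the first three symbols and passes the remaining symbols unchanged.
    """
    q = 1.0 - p
    probs = [0.0] * (1 << (k0 + 1))
    for error in product((0, 1), repeat=3 + k0):
        weight = sum(error)
        top = 1 if sum(error[:3]) >= 2 else 0
        low = int("".join(map(str, error[3:])), 2)
        probs[(top << k0) | low] += p ** weight * q ** (len(error) - weight)
    return probs


def ane_from_levels(probs):
    """Theorem 1: weight Pr{l | 0} by the largest power of 2 not exceeding l."""
    return sum((1 << (l.bit_length() - 1)) * pr
               for l, pr in enumerate(probs) if l)


p, k0 = 0.05, 2
q = 1.0 - p
pr_b_zero = q ** 3 + 3 * p * q ** 2        # base code decodes zero correctly
ane_base = 3 * p ** 2 * q + p ** 3         # ANE_B for a one-bit base message
uncoded_part = sum(2 ** (j - 1) * p * q ** (k0 - j) for j in range(1, k0 + 1))

ane_direct = ane_from_levels(sb_prob_given_zero(p, k0))
ane_theorem2 = pr_b_zero * uncoded_part + 2 ** k0 * ane_base     # (11)
print(ane_direct, ane_theorem2)
```

The direct enumeration runs over all $2^n$ error patterns, whereas (11) needs only the two base-code quantities, which illustrates why the base code is the easier object to analyze.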




V. CONSTANT-SYMBOL-RATE TRANSMISSION 


Consider two error-correcting codes which are denoted as $V_1$ and
$V_2$. Let $V_1$ be an $(n_1, k)$ code and $V_2$ be an $(n_2, k)$ code where $n_1$
may or may not be equal to $n_2$. Let $\epsilon_1$ denote the minimum weight
of the $n_1$-tuples that are not coset leaders in the standard array for
$V_1$. Similarly, let $\epsilon_2$ denote the minimum weight of the $n_2$-tuples that
are not coset leaders in the standard array for $V_2$.

For a binary symmetric channel with independent errors, Pr{i | 0}
for $V_1$ is

$$\Pr\{i \mid 0\} = \sum_{j=\epsilon_1}^{n_1} r_{ij}\, p^j q^{n_1 - j},$$

where $r_{ij}$ is the number of $n_1$-tuples of weight j in the column headed
by C(i) in the standard array for $V_1$. Thus, for $V_1$, the average
numerical error ($\mathrm{ANE}_1$) is

$$\mathrm{ANE}_1 = \sum_{j=\epsilon_1}^{n_1} c_j\, p^j q^{n_1 - j},$$

where

$$c_j = \sum_{l=1}^{k} 2^{l-1} \sum_{i=2^{l-1}}^{2^l - 1} r_{ij}.$$

Similarly, for $V_2$,

$$\mathrm{ANE}_2 = \sum_{j=\epsilon_2}^{n_2} \gamma_j\, p^j q^{n_2 - j},$$

where the $\gamma_j$ are the appropriate constants.
However,

$$\mathrm{ANE}_1 \to c_{\epsilon_1}\, p^{\epsilon_1} q^{n_1 - \epsilon_1} \quad \text{as } p \to 0$$

and

$$\mathrm{ANE}_2 \to \gamma_{\epsilon_2}\, p^{\epsilon_2} q^{n_2 - \epsilon_2} \quad \text{as } p \to 0.$$

Thus, for p sufficiently small, if $\epsilon_1 > \epsilon_2$, $\mathrm{ANE}_1 < \mathrm{ANE}_2$ and $V_1$
results in less ANE than $V_2$.

The minimum weight of the vectors that are not coset leaders in an
SB code is 1. Thus, consider two SB codes denoted by $V_{SB1}$ and $V_{SB2}$,
where $V_{SB1}$ is an $(n_1, k)$ code and $V_{SB2}$ is an $(n_2, k)$ code. $V_{SB1}$ protects
the $(k - k_{01})$ most significant positions and $V_{SB2}$ protects the $(k - k_{02})$
most significant positions of a message. By reasoning analogous to
that above, for $p$ small, if $k_{01} < k_{02}$ and if the base codes used in $V_{SB1}$
and $V_{SB2}$ correct all weight one errors, then $V_{SB1}$ results in less ANE
than $V_{SB2}$.

We thus have the following ranking of codes for p small. The ranking 
(in order of increasing effectiveness) assumes that the schemes are 
compared for the same value of k. 


(i) Uncoded transmission.

(ii) An SB code protecting $(k - k_0)$ positions where $k \neq k_0$.

(iii) An SB code protecting $(k - k_0 + k')$ positions where $k' > 0$.

(iv) An $e$-error-correcting code where $e \geq 1$.

(v) An $(e + e')$-error-correcting code where $e' > 0$.


To obtain a feeling for the utility of coding for numerical data
transmission over a binary symmetric channel with independent errors, the
ANE resulting from certain codes for $k = 26$ will be evaluated for
constant-symbol-rate transmission. Ref. 3 contains similar information
for $k = 1$, 4, and 11.

Let $\mathrm{ANE}_{NC}$ denote the ANE when no coding is used. Contrary to
the concept of code equivalence that is obtained under the assumption
that all errors are equally costly (i.e., when probability of message
error is used as the measure of code performance), the ordering of the
columns of the parity check matrix can affect code performance. Thus,
for the (31, 26) perfect single error-correcting code (PSEC code), every
ordering of the columns of the parity check matrix could yield a distinct
ANE. Upper and lower bounds on the ANE for this code are obtained
in Ref. 3 and are denoted herein as $\mathrm{ANE}_{UB}$ and $\mathrm{ANE}_{LB}$, respectively.

By numerical computation, the ordering in (14) was found to result 
in as small an ANE as any other ordering tried. The number actually 
tried was by necessity a small fraction of all possible orderings of the 
26 columns. However, notice that $C_{12}$ through $C_{26}$ each have a one in
the same position, thus assuring us that the number of weight three
code vectors on levels 12 through 26 will be the theoretical minimum for 
this code (by Theorem 9 in Ref. 3). For values of p that are of primary 
interest (less than 107°), this assures us that it is not possible to find 
a different ordering that will result in a significantly better performance 
(although there are other orderings that in fact give equal performance). 
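Let $\mathrm{ANE}_P$ denote the ANE that results from the code specified in (14).

$$
H_P = \left[\begin{array}{c|c}
\begin{array}{c}
\mathtt{11111111111111100000000000}\\
\mathtt{11111111000000011111110000}\\
\mathtt{11110000111100011110001110}\\
\mathtt{11001100110011011001101101}\\
\mathtt{10101010101010110101011011}
\end{array} & I_5
\end{array}\right]. \tag{14}
$$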


If the columns of (14) are regarded as the 5-bit binary representations
of integers, then the ordering from left to right corresponds to decreasing
integer value (with powers of two omitted because they appear in $I_5$).
Similar ordering was observed to be preferable for the (15, 11) PSEC
code³ and, by exhaustive search, actually found to be as good as any
other ordering for the (7, 4) PSEC code³.
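
The ordering rule just described can be sketched in a few lines of Python. This is only an illustration of the stated rule (decreasing integer value, powers of two omitted); the function names `psec_parity_columns` and `parity_rows` are hypothetical and do not come from the paper.

```python
def psec_parity_columns(r):
    """Non-identity columns of the parity check matrix of a
    (2**r - 1, 2**r - 1 - r) PSEC code, as integers in decreasing order.
    Powers of two are skipped because they form the identity part."""
    return [v for v in range(2**r - 1, 0, -1) if (v & (v - 1)) != 0]


def parity_rows(r):
    """Rows of the non-identity part as strings of 0s and 1s.
    Row 0 holds the most significant bit of every column value."""
    cols = psec_parity_columns(r)
    return ["".join(str((v >> (r - 1 - i)) & 1) for v in cols) for i in range(r)]


rows_31_26 = parity_rows(5)   # (31, 26) code: 26 columns, 5 rows
rows_15_11 = parity_rows(4)   # (15, 11) code: 11 columns, 4 rows
for row in rows_31_26:
    print(row)
```

With `r = 5` the sketch yields 26 columns whose 15 leftmost entries all have a one in the first row, which is the property noted above for the most significant positions.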

Table I compares $\mathrm{ANE}_{LB}$, $\mathrm{ANE}_{UB}$, and $\mathrm{ANE}_P$. For convenience
(and so that the values given will agree with the data plotted in Figs.
2, 3, and 4), the ANE has been normalized by dividing by $2^{26} - 1$
(i.e., the full-scale value).

TABLE I. VALUES OF ANE_LB, ANE_UB, AND ANE_P DIVIDED BY 2^26 − 1

p        ANE_LB             ANE_P              ANE_UB
10^-5    0.41992 × 10^-8    0.41994 × 10^-8    0.42993 × 10^-8
10^-4    0.41920 × 10^-6    0.419381 × 10^-6   0.42932 × 10^-6
10^-3    0.41212 × 10^-4    0.41310 × 10^-4    0.42329 × 10^-4
10^-2    0.34850 × 10^-2    0.35659 × 10^-2    0.36894 × 10^-2
10^-1    0.90222 × 10^-1    0.10446            0.12817

The following SB codes are considered. For each, $H_B$ and the notation
used for the resulting ANE in Figs. 2, 3, and 4 is given. Theorem
2 permits the computation of the ANE for these codes from a knowledge
of the base code.

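Base Code 1: (3, 1) PSEC code.

$$
H_B = \left[\begin{array}{c|c}
\begin{matrix} 1 \\ 1 \end{matrix} & I_2
\end{array}\right]
$$

The ANE is denoted as $\mathrm{ANE}_{(3,1)}$.

Base Code 2: (5, 1) perfect double error-correcting code.

$$
H_B = \left[\begin{array}{c|c}
\begin{matrix} 1 \\ 1 \\ 1 \\ 1 \end{matrix} & I_4
\end{array}\right]
$$

The ANE is denoted as $\mathrm{ANE}_{(5,1)}$.

Base Code 3: This base code uses independent (3, 1) PSEC codes to
protect the two most significant information positions.

$$
H_B = \left[\begin{array}{c|c}
\begin{matrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{matrix} & I_4
\end{array}\right]
$$

Because the codes are used independently, the required conditional
probabilities for the base code can be readily calculated. The ANE
is denoted as $\mathrm{ANE}_{(3,1),(3,1)}$.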


Base Code 4: (7, 4) PSEC code.

[Parity check matrix $H_B$ not legible in the source.]

The ANE is denoted as $\mathrm{ANE}_{(7,4)}$.

Base Code 5: This base code uses a (3, 1) PSEC code to protect the
most significant information position and a (7, 4) PSEC code to protect
the next four most significant information positions.

[Parity check matrix $H_B$ not legible in the source.]

The ANE is denoted as $\mathrm{ANE}_{(3,1),(7,4)}$.
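Base Code 6: (15, 11) PSEC code.

$$
H_B = \left[\begin{array}{c|c}
\begin{array}{c}
\mathtt{11111110000}\\
\mathtt{11110001110}\\
\mathtt{11001101101}\\
\mathtt{10101011011}
\end{array} & I_4
\end{array}\right]
$$

The ANE is denoted as $\mathrm{ANE}_{(15,11)}$.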

Figs. 2, 3, and 4 present $\mathrm{ANE}_{NC}$, $\mathrm{ANE}_{UB}$, $\mathrm{ANE}_{LB}$, $\mathrm{ANE}_P$, and
the ANE of the SB codes considered. In each case, the ANE has been
normalized by dividing by $2^{26} - 1$. For clarity, logarithmic scales are
used as p decreases from 107° until p becomes sufficiently small so 
that the results for small p apply. 

The following observations can be made for constant-symbol-rate 
transmission. 


(i) Improvements in transmission fidelity are obtainable by the
utilization of codes. It should be noted that no one code is the most
desirable for all $p$ ($0 < p < \tfrac{1}{2}$) and in some cases the codes that are
best for small $p$ turn out to be less effective than uncoded transmission
for the larger values of $p$.

(ii) For $k = 26$, it can be shown that the probability that a message
is received in error when the PSEC code is used is less (for $0 < p < \tfrac{1}{2}$)
than the probability that a message is received in error using any of
the SB codes considered. Thus, under the criterion of minimizing the
probability that a message is received in error, the PSEC code is
preferable to any of the SB codes considered.

However, when the ANE is used as a measure of code effectiveness 
for numerical data transmission, we observe that the SB codes are 
preferable to the PSEC code for certain values of p. Thus, when com- 
paring codes, the ranking obtained using probability of message error 
as the performance index may not correspond to the ranking obtained 
using ANE as an index. We can conclude that probability of message
error and ANE are not equivalent measures of code performance and
that, in some cases, the ANE can be reduced by using a code whose
probability of message error is not minimal.

Fig. 2. Constant-symbol-rate transmission; k = 26. (Axes: ANE/(2^26 − 1) versus p.)

(iii) For $k = 26$, consider the relative performance of the PSEC
code and the SB codes. When $p$ is small, the PSEC code will be effective
because it can correct all single errors (the only type that have much
probability of occurring) whereas a single error in certain positions
of an SB code will result in a message error. For larger values of $p$,
there is an increasing chance that an error pattern will occur which the
PSEC code cannot correct. The SB codes become effective in this
situation. If multiple errors occur during transmission such that the errors
occurring in the $(k - k_0)$ most significant information positions and the
check positions form an error pattern correctable by the base code,
this will be corrected leaving any errors in the $k_0$ least significant
information positions uncorrected. Therefore, the most costly portion
of a large number of error patterns can be corrected. As $p$ increases,
the number of positions in the base code must decrease so that
uncorrectable error patterns in the positions covered by the base code
have a sufficiently small probability of occurrence so that the base code
can operate effectively. In other words, as $p$ increases, more and more
protection must be provided for the significant bits so that the most
costly errors are prevented.


Fig. 3. Constant-symbol-rate transmission; k = 26. (Curve labels include ANE_P, ANE_(3,1), ANE_(3,1),(7,4), and ANE_(15,11).)

Fig. 4. Constant-symbol-rate transmission; k = 26.


(iv) For $p$ small, the ANE from uncoded transmission is approximately
$(2^k - 1)p$. For small $p$, the ANE as a fraction of full scale for
uncoded transmission is thus very nearly independent of $k$.
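
Observation (iv) can be checked by brute force for small $k$. The sketch below rests on an assumption that this section does not spell out: the numerical error is taken to be the absolute difference between the transmitted and received integers, averaged over equally likely messages. The function name `uncoded_ane` is hypothetical.

```python
def uncoded_ane(k, p):
    """Average |received - sent| for uncoded k-bit messages on a binary
    symmetric channel with crossover probability p (assumed definition)."""
    q = 1.0 - p
    total = 0.0
    for sent in range(2**k):
        for err in range(2**k):
            w = bin(err).count("1")        # number of bits in error
            received = sent ^ err          # error pattern flips those bits
            total += abs(received - sent) * p**w * q**(k - w)
    return total / 2**k                    # messages assumed equally likely


p = 1e-4
ratios = {}
for k in (3, 4, 6):
    normalized = uncoded_ane(k, p) / (2**k - 1)   # fraction of full scale
    ratios[k] = normalized / p                    # close to 1 when p is small
    print(k, ratios[k])
```

Under that assumption the normalized value divided by $p$ stays close to one for each $k$ tried, in line with the statement that the fraction of full scale is very nearly independent of $k$.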
Realizability Conditions for

the Impedance Function of the 

Lossless Tapered Transmission Line— 
A Critique 


By E. N. PROTONOTARIOS 
(Manuscript received March 6, 1967) 


I. INTRODUCTION


In a recent brief¹ in the B.S.T.J., Zador presents, without proof,
realizability conditions for the input impedance of the lossless tapered
transmission line terminated in unit resistance. Upon a careful
examination of the brief, it appears that the conditions are not accurate. The
following analysis clarifies this point and, incidentally, provides
alternatives to Zador's necessary conditions.

Consider a nonuniform line (Fig. 1) with inductance per unit length
$\mathcal{L}(x)$ and capacitance per unit length $\mathcal{C}(x)$ such that (to follow Zador)

$$\mathcal{L}(x)\,\mathcal{C}(x) = 1.$$


Let $V(x,s)$ and $I(x,s)$ be the voltage and current along the line with
polarities as indicated in Fig. 1. The equations of the line are

$$\frac{dV(x,s)}{dx} = -s\,\mathcal{L}(x)\,I(x,s)$$

$$\frac{dI(x,s)}{dx} = -s\,\mathcal{C}(x)\,V(x,s).$$

Eliminating $I(x,s)$ and taking into account that $\mathcal{L}(x) = 1/\mathcal{C}(x)$ we get

$$\frac{d}{dx}\left(\mathcal{C}(x)\,\frac{dV(x,s)}{dx}\right) = s^2\,\mathcal{C}(x)\,V(x,s).$$

Note also that

$$I(x,s) = -\frac{\mathcal{C}(x)}{s}\,\frac{dV(x,s)}{dx}.$$

Hence, we can identify Zador's $y(x,s)$ and $c(x)$ with $V(x,s)$ and $\mathcal{C}(x)$,
respectively. From the reference polarities of the voltages and currents
in Fig. 1, we see that for a unit resistance termination at $x = 0$ we must
have

$$V(0,s) = -I(0,s).$$

Fig. 1. Lossless tapered transmission line.

Hence, if we impose the condition (following Zador) 
y(0,s) = V(O,s) = -2 
then for unit resistance termination we should have 
dV(0,s) _sV@O,s) as 


—_ _—— 


Os) = a = 60) a0)’ 


The driving point impedance, for any termination, should read

$$Z(s) = \frac{s}{c(l)}\,\frac{y(l,s)}{y'(l,s)}.$$
Thus, the signs are wrong in Ref. 1. This is not the crucial error, however.
In this paper, we will show that the difficulties in Zador's paper
arise from the following facts:

(i) He does not consider the matched line. Unmatched lines tend
to have almost periodic behavior for large real frequencies and hence
the network functions do not have limits at infinity. This point will
be made more precise in the sequel.

(ii) Multiplication of $Z(j\omega)$ by $\exp(-2jl\omega)$ in property (iii) of the
necessity statement introduces periodic behavior at infinity even in the
matched case.

(iii) Physical meaning has not been attached to the $N_i$ and $D_i$.
These should obviously be identified with the well-known ABCD
parameters to correct (ii) of the necessity conditions.

II. COMMENTS ON ZADOR’S BRIEF 


Property (iii) in the necessity statement does not appear to be true
as stated. One can easily construct many counterexamples.

Example 1: The uniform line with (following Zador's notation) $c(x) = 1$
and length $l = 1$, terminated in a 1-ohm resistor. Obviously $c(x)$ satisfies
the conditions stipulated by Zador, i.e., $c(x)$ is positive and continuously
differentiable in the interval $0 \leq x \leq 1$. Clearly the driving point
impedance is

$$Z(j\omega) = 1.$$

Therefore,

$$f(\omega) = \operatorname{Re}\,[\exp(-2jl\omega)\,Z(j\omega)] = \cos 2\omega.$$

Clearly $\cos 2\omega$ does not have a limit for $\omega \to +\infty$.
Consider now a less trivial counterexample.


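Example 2: The exponential line terminated in a unit resistance. With
Zador's notation $c(x) = \exp 2x$, and $l = 1$. In this case by solving Zador's
(1) with the subsequent boundary conditions (appropriately corrected)
we find

$$Z(j\omega) = \frac{A(j\omega) + B(j\omega)}{C(j\omega) + D(j\omega)}, \tag{1}$$

where

$$A(j\omega) = \frac{1}{e}\left[\cos\sqrt{\omega^2 - 1} + \frac{1}{\sqrt{\omega^2 - 1}}\,\sin\sqrt{\omega^2 - 1}\right] \tag{2}$$

$$B(j\omega) = \frac{j\omega}{e}\,\frac{\sin\sqrt{\omega^2 - 1}}{\sqrt{\omega^2 - 1}} \tag{3}$$

$$C(j\omega) = j\omega e\,\frac{\sin\sqrt{\omega^2 - 1}}{\sqrt{\omega^2 - 1}} \tag{4}$$

$$D(j\omega) = e\left[\cos\sqrt{\omega^2 - 1} - \frac{1}{\sqrt{\omega^2 - 1}}\,\sin\sqrt{\omega^2 - 1}\right]. \tag{5}$$

It turns out that

$$R = \operatorname{Re} Z(j\omega) = \frac{AD - BC}{D^2 - C^2} = \frac{1}{e^2 K(\omega)} \tag{6}$$

$$X = \operatorname{Im} Z(j\omega) = -j\,\frac{BD - AC}{D^2 - C^2} = -\frac{1}{e^2 K(\omega)}\,\frac{2\omega \sin^2\sqrt{\omega^2 - 1}}{\omega^2 - 1}, \tag{7}$$

where

$$K(\omega) = \left\{\cos\sqrt{\omega^2 - 1} - \frac{1}{\sqrt{\omega^2 - 1}}\,\sin\sqrt{\omega^2 - 1}\right\}^2 + \frac{\omega^2 \sin^2\sqrt{\omega^2 - 1}}{\omega^2 - 1}. \tag{8}$$

Hence,

$$f(\omega) = \operatorname{Re}\,[\exp(-2j\omega)\,Z(j\omega)] = R\cos 2\omega + X\sin 2\omega
= \frac{1}{e^2 K(\omega)}\left[\cos 2\omega - \frac{2\omega \sin^2\sqrt{\omega^2 - 1}}{\omega^2 - 1}\,\sin 2\omega\right]. \tag{9}$$

Obviously $f(\omega)$ does not possess a limit for $\omega \to +\infty$.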
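
The behavior claimed in Example 2 can be probed numerically. The sketch below is not taken from the paper: it assumes the line equations $dV/dx = -s\mathcal{L}I$, $dI/dx = -s\mathcal{C}V$ with $\mathcal{L}\mathcal{C} = 1$, a load at $x = 0$, and the drive at $x = l$, and it solves the resulting equation $V'' + 2V' - s^2 V = 0$ for $c(x) = \exp 2x$ in closed form. The function names are hypothetical.

```python
import cmath
import math


def exponential_line_impedance(omega, length=1.0, r0=1.0):
    """Input impedance at x = length for c(x) = exp(2x), load r0 at x = 0.
    Assumes V'' + 2V' - s^2 V = 0 and I = -(c/s) dV/dx (see lead-in)."""
    s = 1j * omega
    root = cmath.sqrt(1.0 + s * s)
    r1, r2 = -1.0 + root, -1.0 - root      # characteristic roots
    dv0 = s / r0                           # V(0) = 1, V'(0) = s / (r0 * c(0))
    a = (dv0 - r2) / (r1 - r2)
    b = (r1 - dv0) / (r1 - r2)
    v = a * cmath.exp(r1 * length) + b * cmath.exp(r2 * length)
    dv = a * r1 * cmath.exp(r1 * length) + b * r2 * cmath.exp(r2 * length)
    return s * v / (math.exp(2.0 * length) * dv)


def f(omega, length=1.0):
    """Re[exp(-2 j l omega) Z(j omega)] for the unit-terminated line."""
    z = exponential_line_impedance(omega, length)
    return (cmath.exp(-2j * length * omega) * z).real


# Two large frequencies a quarter period of cos(2*omega) apart.
f_high = f(1000.0 * math.pi)
f_low = f(1000.0 * math.pi + math.pi / 2.0)
print(f_high, f_low)
```

The two printed values sit near $+e^{-2}$ and $-e^{-2}$, so $f(\omega)$ keeps oscillating at large $\omega$ rather than settling to a limit.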


Example 3: Consider now the class of transmission lines which have
a positive bounded and twice differentiable $c(x)$ in the interval $0 \leq x \leq l$.
It can be shown (see e.g., Ref. 2) that the ABCD parameters satisfy
the following asymptotic relations, for $\omega$ large:*

$$A(j\omega) = \sqrt{\frac{c(0)}{c(l)}}\,\cos l\omega + O\!\left(\frac{1}{\omega}\right) \tag{10}$$

$$B(j\omega) = j\,\frac{\sin l\omega}{\sqrt{c(0)c(l)}} + O\!\left(\frac{1}{\omega}\right) \tag{11}$$

$$C(j\omega) = j\,\sqrt{c(0)c(l)}\,\sin l\omega + O\!\left(\frac{1}{\omega}\right) \tag{12}$$

$$D(j\omega) = \sqrt{\frac{c(l)}{c(0)}}\,\cos l\omega + O\!\left(\frac{1}{\omega}\right). \tag{13}$$

These results follow from the classical theory of the asymptotic
behavior of the eigenfunctions of Sturm-Liouville problems. The
WKBJ method is a related subject. Schelkunoff has discussed these
matters in an elementary way in at least one of his textbooks (he does
not include the $O(1/\omega)$ term).

* The line is driven at the point $x = l$. The product of the inductance per unit
length and the capacitance per unit length is assumed to be unity.

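If the line is terminated at $x = 0$ with a resistance $R_0$, we have for
the driving point impedance

$$Z(j\omega) = \frac{R_0 A(j\omega) + B(j\omega)}{R_0 C(j\omega) + D(j\omega)}. \tag{14}$$

Substituting from (10), (11), (12), and (13) we find that for large $\omega$

$$Z(j\omega) = \frac{R_0 c(0)}{c(l)}\left[1 + \frac{j\,(1 - R_0^2 c^2(0))\,\sin l\omega}{R_0 c(0)\,\{\cos l\omega + jR_0 c(0)\sin l\omega\}} + O\!\left(\frac{1}{\omega}\right)\right] \tag{15}$$

and

$$R = \operatorname{Re} Z(j\omega) = \frac{R_0 c(0)}{c(l)}\left[1 + \frac{(1 - R_0^2 c^2(0))\,\sin^2 l\omega}{1 - (1 - R_0^2 c^2(0))\,\sin^2 l\omega} + O\!\left(\frac{1}{\omega}\right)\right] \tag{16}$$

$$X = \operatorname{Im} Z(j\omega) = \frac{1}{c(l)}\left[\frac{(1 - R_0^2 c^2(0))\,\sin 2l\omega}{2\,[1 - (1 - R_0^2 c^2(0))\,\sin^2 l\omega]} + O\!\left(\frac{1}{\omega}\right)\right]. \tag{17}$$

Hence, if $R_0 c(0) \neq 1$, $Z(j\omega)$, $\operatorname{Re} Z(j\omega)$, and $\operatorname{Im} Z(j\omega)$ do not have limits
for $\omega \to +\infty$.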
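
The contrast between the matched and unmatched cases can be seen by evaluating (14) with only the leading terms of (10) through (13). The sketch below drops every $O(1/\omega)$ term, so it illustrates the limiting oscillation only; the function name and the sample values of $R_0$, $c(0)$, and $c(l)$ are hypothetical.

```python
import math


def leading_order_impedance(omega, r0, c0, cl, length):
    """Z(j*omega) from (14) using the leading terms of (10)-(13) only."""
    theta = length * omega
    a = math.sqrt(c0 / cl) * math.cos(theta)
    b = 1j * math.sin(theta) / math.sqrt(c0 * cl)
    c = 1j * math.sqrt(c0 * cl) * math.sin(theta)
    d = math.sqrt(cl / c0) * math.cos(theta)
    return (r0 * a + b) / (r0 * c + d)


c0, cl, length = 1.0, math.exp(2.0), 1.0

# Locally matched line (r0 * c0 = 1): the leading term is constant, 1/c(l).
matched = [leading_order_impedance(w, 1.0, c0, cl, length) for w in (10.0, 11.0, 12.5)]

# Unmatched line (r0 * c0 = 2): the leading term swings between
# r0*c0/c(l) at l*omega = 2*pi and 1/(r0*c0*c(l)) at l*omega = 2.5*pi.
z_cos = leading_order_impedance(2.0 * math.pi, 2.0, c0, cl, length)
z_sin = leading_order_impedance(2.5 * math.pi, 2.0, c0, cl, length)
print(matched, z_cos, z_sin)
```

For $R_0 c(0) = 2$ the leading term alternates between $2/c(l)$ and $1/(2c(l))$ as $l\omega$ advances by a quarter period, which is why no limit exists unless the line is locally matched.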

Similarly, $f(\omega) = \operatorname{Re}\,[\exp(-2jl\omega)\,Z(j\omega)]$ does not have a limit for
$\omega \to +\infty$. When $R_0 c(0) = 1$, i.e., when the line is "locally matched"
at $x = 0$, we have

$$Z(j\omega) = \frac{1}{c(l)} + O\!\left(\frac{1}{\omega}\right) \tag{18}$$

$$R = \operatorname{Re} Z(j\omega) = \frac{1}{c(l)} + O\!\left(\frac{1}{\omega}\right) \tag{19}$$

$$X = \operatorname{Im} Z(j\omega) = O\!\left(\frac{1}{\omega}\right). \tag{20}$$

In this case,

$$f(\omega) = \operatorname{Re}\,[\exp(-2jl\omega)\,Z(j\omega)] = \frac{1}{c(l)}\,\cos 2l\omega + O\!\left(\frac{1}{\omega}\right). \tag{21}$$

Clearly, $f(\omega)$ does not have the asymptotic behavior stipulated by
Zador; it does not even have a limit (because of the $\cos 2l\omega$ term).

Note that the asymptotic formulas (10), (11), (12), and (13) are
also valid for a continuous positive $c(x)$ which is piecewise twice
differentiable. This can be proven by partitioning the line at the
discontinuity points and finding the overall ABCD matrix by multiplying
the ABCD matrices of the sections of the line which now have a twice
differentiable $c(x)$.

Hence, property (iii) of Zador's necessity statement could be replaced
by the following: If (i) $c(x)$ is a positive continuous and piecewise
twice differentiable function of the real variable $x$, (ii) the line is
terminated in a unit resistance and $c(0) = 1$, then the following relation
is valid for large $\omega$:

$$\operatorname{Re} Z(j\omega) = \frac{1}{c(l)} + O\!\left(\frac{1}{\omega}\right). \tag{22}$$


Another substitute will be discussed in the following. Let $\rho(j\omega)$ be the
voltage reflection coefficient at $x = l$ for the unit resistance terminated
line, then

$$Z(j\omega) = \frac{1}{c(l)}\,\frac{1 + \rho(j\omega)}{1 - \rho(j\omega)}. \tag{23}$$

For a $c(x)$ which is continuous and twice differentiable in the interval
$0 \leq x \leq l$ with

$$c(0) = 1, \qquad \frac{dc(0)}{dx} = \frac{dc(l)}{dx} = 0, \tag{24}$$

we can see, using Schelkunoff's results on wave propagation in stratified
media, that for $\omega$ large

$$\rho(j\omega) = O\!\left(\frac{1}{\omega^2}\right). \tag{25}$$

From (23) we have in general for $|\rho(j\omega)| < 1$

$$Z(j\omega) = \frac{1}{c(l)}\,\{1 + 2\rho(j\omega) + 2\rho^2(j\omega) + \cdots\}. \tag{26}$$

Hence, using (25) we get

$$Z(j\omega) = \frac{1}{c(l)} + O\!\left(\frac{1}{\omega^2}\right) \tag{27}$$

for large $\omega$.

To generalize (following Schelkunoff) if $c(0) = 1$ and the first $n$
derivatives of $c(x)$ are continuous functions of $x$ and vanish at the
boundaries then for large $\omega$

$$\rho(j\omega) = O\!\left(\frac{1}{\omega^{n+1}}\right) \tag{28}$$

and therefore,

$$Z(j\omega) = \frac{1}{c(l)} + O\!\left(\frac{1}{\omega^{n+1}}\right). \tag{29}$$

Property (ii) in the necessity statement of Zador is also wrong.

Proof: The input impedance of the unit-resistance terminated line may
be written, in terms of the ABCD parameters, as follows:

$$Z(s) = \frac{A(s) + B(s)}{C(s) + D(s)} = \frac{Q(s)}{P(s)}. \tag{30}$$

Consider a line with a twice differentiable $c(x)$. In this case $A(s)$,
$B(s)$, $C(s)$, and $D(s)$ are entire functions of order 1 and type $l$ (see
Ref. 2), i.e.,

$$A(s) \approx c_1 e^{ls}, \quad B(s) \approx c_2 e^{ls}, \quad C(s) \approx c_3 e^{ls}, \quad D(s) \approx c_4 e^{ls} \tag{31}$$

(where $c_1$, $c_2$, $c_3$, $c_4$ are positive constants) for real $s \to +\infty$. Note
also that

$$A(s) = A(-s), \quad D(s) = D(-s), \quad B(s) = -B(-s), \quad C(s) = -C(-s) \tag{32}$$

and

$$AD - BC = 1. \tag{33}$$

In order to find Zador's representation with the $N_i$, $D_i$ ($i = 1, 2$)
functions we should be able to find an entire function $\varphi(s) \neq 0$ such
that when we multiply both the numerator and denominator of $Z(s)$
in (30) by this entire function, we get functions $N_i$, $D_i$ ($i = 1, 2$) with
the properties stipulated by Zador.

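We will have

$$N_1(s) = \operatorname{Ev}\,[Q(s)\varphi(s)] = A(s)\,\frac{\varphi(s) + \varphi(-s)}{2} + B(s)\,\frac{\varphi(s) - \varphi(-s)}{2} \tag{34}$$

$$N_2(s) = \operatorname{Od}\,[Q(s)\varphi(s)] = A(s)\,\frac{\varphi(s) - \varphi(-s)}{2} + B(s)\,\frac{\varphi(s) + \varphi(-s)}{2}. \tag{35}$$

Similarly,

$$D_1(s) = D(s)\,\frac{\varphi(s) + \varphi(-s)}{2} + C(s)\,\frac{\varphi(s) - \varphi(-s)}{2} \tag{36}$$

$$D_2(s) = D(s)\,\frac{\varphi(s) - \varphi(-s)}{2} + C(s)\,\frac{\varphi(s) + \varphi(-s)}{2}. \tag{37}$$

Hence,

$$N_1(s)D_1(s) - N_2(s)D_2(s) = \varphi(s)\varphi(-s). \tag{38}$$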
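
Identity (38) follows from (33) by direct expansion, and it can be spot-checked numerically. The sketch below is an illustration only: it uses the uniform line ($c(x) = 1$, for which $A = D = \cosh ls$ and $B = C = \sinh ls$) and an arbitrary entire multiplier `phi`, both chosen here as assumptions and not taken from the paper.

```python
import cmath


def abcd_uniform(s, length=1.0):
    """ABCD parameters of a uniform lossless line with c(x) = 1."""
    ch, sh = cmath.cosh(length * s), cmath.sinh(length * s)
    return ch, sh, sh, ch


def phi(s):
    """Hypothetical entire multiplier, chosen only for illustration."""
    return 1.0 + 0.5 * s + cmath.exp(0.3 * s)


def zador_functions(s):
    """N1, N2, D1, D2 built as in (34)-(37)."""
    a, b, c, d = abcd_uniform(s)
    even = (phi(s) + phi(-s)) / 2.0
    odd = (phi(s) - phi(-s)) / 2.0
    n1 = a * even + b * odd     # (34)
    n2 = a * odd + b * even     # (35)
    d1 = d * even + c * odd     # (36)
    d2 = d * odd + c * even     # (37)
    return n1, n2, d1, d2


s = 0.7 + 1.3j                  # arbitrary complex test point
n1, n2, d1, d2 = zador_functions(s)
lhs = n1 * d1 - n2 * d2
rhs = phi(s) * phi(-s)
print(abs(lhs - rhs))
```

The printed difference is at rounding level, consistent with $N_1 D_1 - N_2 D_2 = (AD - BC)\,\varphi(s)\varphi(-s)$ and $AD - BC = 1$.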


From (34), (35), (36), and (37) it follows that the functions $(\varphi(s) + \varphi(-s))/2$
and $(\varphi(s) - \varphi(-s))/2$ should be of type 0 in order that
Zador's $N_i$ and $D_i$ be of type $l$. Consequently, the functions $\varphi(s)$ and
$\varphi(-s)$ themselves are of type 0. Therefore, it is impossible to find
a $\varphi(s)$ such that $\varphi(s)\varphi(-s) = \exp 2ls$ as Zador stipulates. So property
(ii) in Zador's necessity statement could be replaced by

$$N_1 D_1 - N_2 D_2 = k^2,$$

where $k$ is a constant. Then $N_i$, $D_i$ ($i = 1, 2$) are proportional to the
ABCD parameters with proportionality factor $k$.

From the above it follows that the sufficiency part as stated is in- 
accurate. It might be possible to alter the sufficiency conditions to 
make them valid. In this case a proof must be given. The author has 
done related work® on realizability conditions for nonuniform RC lines 
and is familiar with the difficulties involved in proving sufficiency 
conditions of this form. 

Finally, Zador's conjectures do not have an obvious physical
interpretation and hence they should be justified.